PREFACE TO THE THIRD EDITION


When we were asked to prepare the third edition of this book, it was our con- 
sensus that it should not be altered in any significant way, and that Herstein’s 
informal style should be preserved. We feel that one of the book’s virtues is 
the fact that it covers a big chunk of abstract algebra in a condensed and in- 
teresting way. At the same time, without trivializing the subject, it remains ac- 
cessible to most undergraduates. 

We have, however, corrected minor errors, straightened out inconsis- 
tencies, clarified and expanded some proofs, and added a few examples. 

To resolve the many typographical problems of the second edition, 
Prentice Hall has had the book completely retypeset—making it easier and 
more pleasurable to read. 

It has been pointed out to us that some instructors would find it useful 
to have the Symmetric Group S_n and the cycle notation available in Chapter
2, in order to provide more examples of groups. Rather than alter the 
arrangement of the contents, thereby disturbing the original balance, we sug- 
gest an alternate route through the material, which addresses this concern. 
After Section 2.5, one could spend an hour discussing permutations and their 
cycle decomposition (Sections 3.1 and 3.2), leaving the proofs until later. The 
students might then go over several past examples of finite groups and explicitly
set up isomorphisms with subgroups of S_n. This exercise would be motivated
by Cayley’s theorem, quoted in Section 2.5. At the same time, it would
have the beneficial result of making the students more comfortable with the 
concept of an isomorphism. The instructor could then weave in the various 
subgroups of the Symmetric Groups S_n as examples throughout the remainder
of Chapter 2. If desired, one could even introduce Sections 3.1 and 3.2
after Section 2.3 or 2.4. 
Two changes in the format have been made since the first edition. First, 
a Symbol List has been included to facilitate keeping track of terminology. 
Second, a few problems have been marked with an asterisk (*). These serve 
as a vehicle to introduce concepts and simple arguments that relate in some 
important way to the discussion. As such, they should be read carefully. 
Finally, we take this opportunity to thank the many individuals whose 
collective efforts have helped to improve this edition. We thank the review- 
ers: Kwangil Koh from North Carolina State University, Donald Passman 
from the University of Wisconsin, and Robert Zinc from Purdue University. 
And, of course, we thank George Lobell and Elaine Wetterau, and others at 
Prentice Hall who have been most helpful. 
Barbara Cortzen 
David J. Winter 


PREFACE TO THE FIRST EDITION 


In the last half-century or so abstract algebra has become increasingly impor- 
tant not only in mathematics itself, but also in a variety of other disciplines. 
For instance, the results and concepts of abstract algebra
play an ever more important role in physics, chemistry, and computer science,
to cite a few such outside fields. 

In mathematics itself abstract algebra plays a dual role: that of a unify- 
ing link between disparate parts of mathematics and that of a research subject 
with a highly active life of its own. It has been a fertile and rewarding research 
area both in the last 100 years and at the present moment. Some of the great 
accomplishments of our twentieth-century mathematics have been precisely 
in this area. Exciting results have been proved in group theory, commutative 
and noncommutative ring theory, Lie algebras, Jordan algebras, combina- 
torics, and a host of other parts of what is known as abstract algebra. A sub- 
ject that was once regarded as esoteric has become considered as fairly down- 
to-earth for a large cross section of scholars. 

The purpose of this book is twofold. For those readers who either want 
to go on to do research in mathematics or in some allied fields that use alge- 
braic notions and methods, this book should serve as an introduction—and, 
we stress, only as an introduction—to this fascinating subject. For interested 
readers who want to learn what is going on in an engaging part of modern 
mathematics, this book could serve that purpose, as well as provide them with 
some highly usable tools to apply in the areas in which they are interested. 

The choice of subject matter has been made with the objective of introducing
readers to some of the fundamental algebraic systems that are both interesting
and of wide use. Moreover, in each of these systems the aim has
been to arrive at some significant results. There is little purpose served in 
studying some abstract object without seeing some nontrivial consequences of 
the study. We hope that we have achieved the goal of presenting interesting, 
applicable, and significant results in each of the systems we have chosen to 
discuss. 

As the reader will soon see, there are many exercises in the book. They 
are often divided into three categories: easier, middle-level, and harder (with 
an occasional very hard). The purpose of these problems is to allow students 
to test their assimilation of the material, to challenge their mathematical inge- 
nuity, to prepare the ground for material that is yet to come, and to be a 
means of developing mathematical insight, intuition, and techniques. Readers 
should not become discouraged if they do not manage to solve all the prob- 
lems. The intent of many of the problems is that they be tried—even if not 
solved—for the pleasure (and frustration) of the reader. Some of the prob- 
lems appear several times in the book. Trying to do the problems is undoubt- 
edly the best way of going about learning the subject. 

We have strived to present the material in the language and tone of a 
classroom lecture. Thus the presentation is somewhat chatty; we hope that 
this will put the readers at their ease. An attempt is made to give many and 
revealing examples of the various concepts discussed. Some of these exam- 
ples are carried forward to be examples of other phenomena that come up. 
They are often referred to as the discussion progresses. 

We feel that the book is self-contained, except in one section—the sec- 
ond last one of the book—where we make implicit use of the fact that a poly- 
nomial over the complex field has complex roots (that is the celebrated Fun- 
damental Theorem of Algebra due to Gauss), and in the last section where we 
make use of a little of the calculus. 

We are grateful to many people for their comments and suggestions on 
earlier drafts of the book. Many of the changes they suggested have been in- 
corporated and should improve the readability of the book. We should like to 
express our special thanks to Professor Martin Isaacs for his highly useful 
comments. 

We are also grateful to Fred Flowers for his usual superb job of typing 
the manuscript, and to Mr. Gary W. Ostedt of the Macmillan Company for 
his enthusiasm for the project and for bringing it to publication. 

With this we wish all the readers a happy voyage on the mathematical 
journey they are about to undertake into this delightful and beautiful realm 
of abstract algebra. 


I.N.H. 
THINGS FAMILIAR
AND LESS FAMILIAR 


1. A FEW PRELIMINARY REMARKS 


For many readers this book will be their first contact with abstract mathe- 
matics. The subject to be discussed is usually called “abstract algebra,” but 
the difficulties that the reader may encounter are not so much due to the “al- 
gebra” part as they are to the “abstract” part. 

On seeing some area of abstract mathematics for the first time, be it in 
analysis, topology, or what-not, there seems to be a common reaction for the 
novice. This can best be described by a feeling of being adrift, of not having 
something solid to hang on to. This is not too surprising, for while many of the 
ideas are fundamentally quite simple, they are subtle and seem to elude one’s 
grasp the first time around. One way to mitigate this feeling of limbo, or asking 
oneself “What is the point of all this?,” is to take the concept at hand and see 
what it says in particular cases. In other words, the best road to good under- 
standing of the notions introduced is to look at examples. This is true in all of 
mathematics, but it is particularly true for the subject matter of abstract algebra. 

Can one, with a few strokes, quickly describe the essence, purpose, and 
background for the material we shall study? Let’s give it a try. 

We start with some collection of objects S and endow this collection
with an algebraic structure by assuming that we can combine, in one or sev- 
eral ways (usually two), elements of this set S to obtain, once more, elements 
of this set S. These ways of combining elements of S we call operations on S. 
Then we try to condition or regulate the nature of S by imposing certain
rules on how these operations behave on S. These rules are usually called the 
axioms defining the particular structure on S. These axioms are for us to de- 
fine, but the choice made comes, historically in mathematics, from noticing 
that there are many concrete mathematical systems that satisfy these rules or 
axioms. We shall study some of the basic axiomatic algebraic systems in this 
book, namely groups, rings, and fields. 

Of course, one could try many sets of axioms to define new structures. 
What would we require of such a structure? Certainly we would want that 
the axioms be consistent, that is, that we should not be led to some nonsensi- 
cal contradiction computing within the framework of the allowable things the 
axioms permit us to do. But that is not enough. We can easily set up such al- 
gebraic structures by imposing a set of rules on a set S that lead to a patho- 
logical or weird system. Furthermore, there may be very few examples of 
something obeying the rules we have laid down. 

Time has shown that certain structures defined by “axioms” play an im- 
portant role in mathematics (and other areas as well) and that certain others 
are of no interest. The ones we mentioned earlier, namely groups, rings, and 
fields, have stood the test of time. 

A word about the use of “axioms.” In everyday language “axiom” 
means a self-evident truth. But we are not using everyday language; we are
dealing with mathematics. An axiom is not a universal truth—but one of sev- 
eral rules spelling out a given mathematical structure. The axiom is true in 
the system we are studying because we have forced it to be true by hypothe- 
sis. It is a license, in the particular structure, to do certain things. 

We return to something we said earlier about the reaction that many 
students have on their first encounter with this kind of algebra, namely a lack 
of feeling that the material is something they can get their teeth into. Do not 
be discouraged if the initial exposure leaves you in a bit of a fog. Stick with 
it, try to understand what a given concept says, and most importantly, look at 
particular, concrete examples of the concept under discussion. 


PROBLEMS 


1. Let S be a set having an operation * which assigns an element a * b of S
for any a, b ∈ S. Let us assume that the following two rules hold:
1. If a, b are any objects in S, then a * b = a.
2. If a, b are any objects in S, then a * b = b * a.


Show that S can have at most one object. 
2. Let S be the set of all integers 0, ±1, ±2, ..., ±n, .... For a, b in S define
* by a * b = a − b. Verify the following:

(a) a * b ≠ b * a unless a = b.

(b) (a * b) * c ≠ a * (b * c) in general. Under what conditions on a, b, c is
(a * b) * c = a * (b * c)?

(c) The integer 0 has the property that a * 0 = a for every a in S.

(d) For a in S, a * a = 0.

3. Let S consist of the two objects □ and △. We define the operation * on S
by subjecting □ and △ to the following conditions:
1. □ * △ = △ = △ * □.

2. □ * □ = □.

3. △ * △ = □.

Verify by explicit calculation that if a, b, c are any elements of S (i.e., a, b
and c can be any of □ or △), then:

(a) a * b is in S.

(b) (a * b) * c = a * (b * c).

(c) a * b = b * a.

(d) There is a particular a in S such that a * b = b * a = b for all b in S.

(e) Given b in S, then b * b = a, where a is the particular element in Part (d).


2. SET THEORY 


With the changes in the mathematics curriculum in the schools in the United 
States, many college students have had some exposure to set theory. This in- 
troduction to set theory in the schools usually includes the elementary no- 
tions and operations with sets. Going on the assumption that many readers 
will have some acquaintance with set theory, we shall give a rapid survey of 
those parts of set theory that we shall need in what follows. 

First, however, we need some notation. To avoid the endless repetition 
of certain phrases, we introduce a shorthand for these phrases. Let S be a 
collection of objects; the objects of S we call the elements of S. To denote 
that a given element, a, is an element of S, we write a ∈ S; this is read “a is
an element of S.” To denote the contrary, namely that an object a is not an
element of S, we write a ∉ S. So, for instance, if S denotes the set of all positive
integers 1, 2, 3, ..., n, ..., then 165 ∈ S, whereas −13 ∉ S.

We often want to know or prove that given two sets S and T, one of
these is a part of the other. We say that S is a subset of T, which we write
S ⊂ T (read “S is contained in T”) if every element of S is an element of T.
In terms of the notation we now have: S ⊂ T if s ∈ S implies that s ∈ T. We
can also denote this by writing T ⊃ S, read “T contains S.” (This does not exclude
the possibility that S = T, that is, that S and T have exactly the same
elements.) Thus, if T is the set of all positive integers and S is the set of all
positive even integers, then S ⊂ T, and S is a subset of T. In the definition
given above, S ⊃ S for any set S; that is, S is always a subset of itself.

We shall frequently need to show that two sets S and T, defined perhaps
in distinct ways, are equal, that is, they consist of the same set of elements.
The usual strategy for proving this is to show that both S ⊂ T and
T ⊂ S. For instance, if S is the set of all positive integers having 6 as a factor
and T is the set of all positive integers having both 2 and 3 as factors, then
S = T. (Prove!)

The need also arises for a very peculiar set, namely one having no elements.
This set is called the null or empty set and is denoted by ∅. It has the
property that it is a subset of any set S.

Let A, B be subsets of a given set S. We now introduce methods of constructing
other subsets of S from A and B. The first of these is the union of A
and B, written A ∪ B, which is defined: A ∪ B is that subset of S consisting
of those elements of S that are elements of A or are elements of B. The “or”
we have just used is somewhat different in meaning from the ordinary usage
of the word. Here we mean that an element c is in A ∪ B if it is in A, or is in
B, or is in both. The “or” is not meant to exclude the possibility that both
things are true. Consequently, for instance, A ∪ A = A.

If A = {1, 2, 3} and B = {2, 4, 6, 10}, then A ∪ B = {1, 2, 3, 4, 6, 10}.

We now proceed to our second way of constructing new sets from old.
Again let A and B be subsets of a set S; by the intersection of A and B, written
A ∩ B, we shall mean the subset of S consisting of those elements that
are both in A and in B. Thus, in the example above, A ∩ B = {2}. It should
be clear from the definitions involved that A ∩ B ⊂ A and A ∩ B ⊂ B.
Particular examples of intersections that hold universally are: A ∩ A = A,
A ∩ S = A, A ∩ ∅ = ∅.

This is an opportune moment to introduce a notational device that will
be used time after time. Given a set S, we shall often be called on to describe
the subset A of S, whose elements satisfy a certain property P. We
shall write this as A = {s ∈ S | s satisfies P}. For instance, if A, B are subsets
of S, then A ∪ B = {s ∈ S | s ∈ A or s ∈ B} while A ∩ B = {s ∈ S | s ∈ A
and s ∈ B}.
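
As a computational aside, the set-builder notation has a close analogue in Python's set comprehensions. The following is only a minimal sketch: the finite set S, the subsets A and B, and the property P (taken here to be "s is even") are illustrative choices, not part of the text.

```python
# Illustrative finite set S and two of its subsets (assumed values).
S = set(range(1, 13))
A = {1, 2, 3}
B = {2, 4, 6, 10}

# {s in S | s satisfies P}, with P chosen here as "s is even".
evens = {s for s in S if s % 2 == 0}

# Union and intersection written in set-builder form.
union = {s for s in S if s in A or s in B}
intersection = {s for s in S if s in A and s in B}

# Both agree with Python's built-in set operations.
print(union == (A | B), intersection == (A & B))
```

Note that the "or" in the comprehension is inclusive, matching the meaning of "or" used in the definition of the union.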

Although the notions of union and intersection of subsets of S have 
been defined for two subsets, it is clear how one can define the union and in- 
tersection of any number of subsets. 

We now introduce a third operation we can perform on sets, the difference
of two sets. If A, B are subsets of S, we define A − B = {a ∈ A | a ∉ B}.
So if A is the set of all positive integers and B is the set of all even integers,
then A − B is the set of all positive odd integers. In the particular case when
A is a subset of S, the difference S − A is called the complement of A in S
and is written A'.

We represent these three operations pictorially by diagrams in which A
and B are drawn as two overlapping regions. [Figures omitted: in each
diagram the shaded area is, respectively, 1. A ∪ B; 2. A ∩ B; 3. A − B;
4. B − A.]


Note the relation among the three operations, namely the equality
A ∪ B = (A ∩ B) ∪ (A − B) ∪ (B − A). As an illustration of how one goes
about proving the equality of sets constructed by such set-theoretic constructions,
we prove this latter alleged equality. We first show that (A ∩ B) ∪
(A − B) ∪ (B − A) ⊂ A ∪ B; this part is easy for, by definition, A ∩ B ⊂ A,
A − B ⊂ A, and B − A ⊂ B, hence


(A ∩ B) ∪ (A − B) ∪ (B − A) ⊂ A ∪ A ∪ B = A ∪ B.


Now for the other direction, namely that A ∪ B ⊂ (A ∩ B) ∪ (A − B) ∪
(B − A). Given u ∈ A ∪ B, if u ∈ A and u ∈ B, then u ∈ A ∩ B, so it is certainly
in (A ∩ B) ∪ (A − B) ∪ (B − A). On the other hand, if u ∈ A but
u ∉ B, then, by the very definition of A − B, u ∈ A − B, so again it is certainly
in (A ∩ B) ∪ (A − B) ∪ (B − A). Finally, if u ∈ B but u ∉ A, then
u ∈ B − A, so again it is in (A ∩ B) ∪ (A − B) ∪ (B − A). We have thus
covered all the possibilities and have shown that A ∪ B ⊂ (A ∩ B) ∪
(A − B) ∪ (B − A). Having the two opposite containing relations of A ∪ B
and (A ∩ B) ∪ (A − B) ∪ (B − A), we obtain the desired equality of these
two sets.
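
The equality relating union, intersection, and difference can also be spot-checked by machine on a small example. The sketch below is an illustration only: the set S = {1, 2, 3, 4} and the helper `subsets` are hypothetical choices, and a finite check of this kind is a sanity test, not a substitute for the proof.

```python
from itertools import combinations

def subsets(universe):
    """Return all subsets of a finite collection, as frozensets."""
    items = list(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

S = {1, 2, 3, 4}  # illustrative choice of S
all_subsets = subsets(S)

# Check A ∪ B = (A ∩ B) ∪ (A − B) ∪ (B − A) for every pair of subsets of S.
identity_holds = all(
    (A | B) == (A & B) | (A - B) | (B - A)
    for A in all_subsets
    for B in all_subsets
)
print(identity_holds)
```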


We close this brief review of set theory with yet another construction
we can carry out on sets. This is the Cartesian product defined for the two
sets A, B by A × B = {(a, b) | a ∈ A, b ∈ B}, where we declare the ordered
pair (a, b) to be equal to the ordered pair (a_1, b_1) if and only if a = a_1 and
b = b_1. Here, too, we need not restrict ourselves to two sets; for instance, we
can define, for sets A, B, C, their Cartesian product as the set of ordered
triples (a, b, c), where a ∈ A, b ∈ B, c ∈ C and where equality of two ordered
triples is defined component-wise.
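
For finite sets the Cartesian product can be formed explicitly. The sketch below uses Python tuples as ordered pairs and triples; the particular sets A, B, C are illustrative assumptions.

```python
from itertools import product

# Illustrative finite sets (assumed values).
A = {1, 2}
B = {"x", "y", "z"}
C = {10, 20}

# A × B as a set of ordered pairs.
AxB = set(product(A, B))

# Ordered pairs are equal exactly when they agree component by component,
# so the order of the components matters.
same_pair = (1, "x") == (1, "x")
swapped_pair = (1, "x") == ("x", 1)

# A × B × C as a set of ordered triples (a, b, c).
AxBxC = set(product(A, B, C))
print(sorted(AxB))
```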


PROBLEMS 
Easier Problems 


1. Describe the following sets verbally. 

(a) S = {Mercury, Venus, Earth, ..., Pluto}. 

(b) S = {Alabama, Alaska, ..., Wyoming}. 
2. Describe the following sets verbally. 

(a) S = {2, 4, 6, 8,...}. 

(b) S = {2, 4, 8, 16, 32,...}. 

(c) S = {1, 4, 9, 16, 25, 36, .. .}. 

3. If A is the set of all residents of the United States, B the set of all Canadian
citizens, and C the set of all women in the world, describe the sets
A ∩ B ∩ C, A − B, A − C, C − A verbally.

4. If A = {1, 4, 7, a} and B = {3, 4, 9, 11} and you have been told that
A ∩ B = {4, 9}, what must a be?

5. If A ⊂ B and B ⊂ C, prove that A ⊂ C.

6. If A ⊂ B, prove that A ∪ C ⊂ B ∪ C for any set C.

7. Show that A ∪ B = B ∪ A and A ∩ B = B ∩ A.

8. Prove that (A − B) ∪ (B − A) = (A ∪ B) − (A ∩ B). What does this
look like pictorially?

9. Prove that A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).
10. Prove that A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).
11. Write down all the subsets of S = {1, 2, 3, 4}.


Middle-Level Problems 


*12. If C is a subset of S, let C' denote the complement of C in S. Prove the
De Morgan Rules for subsets A, B of S, namely:
(a) (A ∩ B)' = A' ∪ B'.
(b) (A ∪ B)' = A' ∩ B'.

*13. Let S be a set. For any two subsets of S we define


A + B = (A − B) ∪ (B − A) and A · B = A ∩ B.
Prove that:

(a) A + B = B + A.
(b) A + ∅ = A.
(c) A · A = A.
(d) A + A = ∅.
(e) A + (B + C) = (A + B) + C.
(f) If A + B = A + C, then B = C.
(g) A · (B + C) = A · B + A · C.

*14. If C is a finite set, let m(C) denote the number of elements in C. If A, B
are finite sets, prove that


m(A ∪ B) = m(A) + m(B) − m(A ∩ B).


15. For three finite sets A, B, C find a formula for m(A ∪ B ∪ C). (Hint:
First consider D = B ∪ C and use the result of Problem 14.)

16. Take a shot at finding m(A_1 ∪ A_2 ∪ ··· ∪ A_n) for n finite sets A_1, A_2, ...,
A_n.

17. Use the result of Problem 14 to show that if 80% of all Americans have
gone to high school and 70% of all Americans read a daily newspaper,
then at least 50% of Americans have both gone to high school and read a
daily newspaper.

18. A public opinion poll shows that 93% of the population agreed with the
government on the first decision, 84% on the second, and 74% on the
third, for three decisions made by the government. At least what percentage
of the population agreed with the government on all three decisions?
(Hint: Use the results of Problem 15.)

19. In his book A Tangled Tale, Lewis Carroll proposed the following riddle
about a group of disabled veterans: “Say that 70% have lost an eye, 75%
an ear, 80% an arm, 85% a leg. What percentage, at least, must have lost
all four?” Solve Lewis Carroll’s problem.

*20. Show, for finite sets A, B, that m(A × B) = m(A)m(B).

21. If S is a set having five elements:

(a) How many subsets does S have?

(b) How many subsets having four elements does S have?

(c) How many subsets having two elements does S have?

Harder Problems 


22. (a) Show that a set having n elements has 2^n subsets.
(b) If 0 < m < n, how many subsets are there that have exactly m elements?
3. MAPPINGS


One of the truly universal concepts that runs through almost every phase of 
mathematics is that of a function or mapping from one set to another. One 
could safely say that there is no part of mathematics where the notion does 
not arise or play a central role. The definition of a function from one set to 
another can be given in a formal way in terms of a subset of the Cartesian 
product of these sets. Instead, here, we shall give an informal and admittedly 
nonrigorous definition of a mapping (function) from one set to another. 

Let S, T be sets; a function or mapping f from S to T is a rule that assigns
to each element s ∈ S a unique element t ∈ T. Let’s explain a little
more thoroughly what this means. If s is a given element of S, then there is
only one element t in T that is associated to s by the mapping. As s varies
over S, t varies over T (in a manner depending on s). Note that by the definition
given, the following is not a mapping. Let S be the set of all people in
the world and T the set of all countries in the world. Let f be the rule that assigns
to every person his or her country of citizenship. Then f is not a mapping
from S to T. Why not? Because there are people in the world that enjoy
a dual citizenship; for such people there would not be a unique country of citizenship.
Thus, if Mary Jones is both an English and French citizen, f would
not make sense, as a mapping, when applied to Mary Jones. On the other
hand, the rule f: R → R, where R is the set of real numbers, defined by
f(a) = a^2 for a ∈ R, is a perfectly good function from R to R. It should be
noted that f(−2) = (−2)^2 = 4 = f(2), and f(−a) = f(a) for all a ∈ R.

We denote that f is a mapping from S to T by f: S → T and for the
t ∈ T mentioned above we write t = f(s); we call t the image of s under f.

The concept is hardly a new one for any of us. Since grade school we 
have constantly encountered mappings and functions, often in the form of 
formulas. But mappings need not be restricted to sets of numbers. As we see 
below, they can occur in any area. 


Examples 


1. Let S = {all men who have ever lived} and T = {all women who have ever
lived}. Define f: S → T by f(s) = mother of s. Therefore, f(John F. Kennedy)
= Rose Kennedy, and according to our definition, Rose Kennedy is
the image under f of John F. Kennedy.

2. Let S = {all legally employed citizens of the United States} and T = {positive
integers}. Define, for s ∈ S, f(s) by f(s) = Social Security Number of s.
(For the purpose of this text, let us assume that all legally employed citizens
of the United States have a Social Security Number.) Then f defines a mapping
from S to T.
3. Let S be the set of all objects for sale in a grocery store and let T = {all
real numbers}. Define f: S → T by f(s) = price of s. This defines a mapping
from S to T.
4. Let S be the set of all integers and let T = S. Define f: S → T by f(m) =
2m for any integer m. Thus the image of 6 under this mapping, f(6), is given
by f(6) = 2 · 6 = 12, while that of −3, f(−3), is given by f(−3) = 2(−3) =
−6. If s_1, s_2 are in S and f(s_1) = f(s_2), what can you say about s_1 and s_2?
5. Let S = T be the set of all real numbers; define f: S → T by f(s) = s^2.
Does every element of T come up as an image of some s ∈ S? If not, how
would you describe the set of all images {f(s) | s ∈ S}? When is f(s_1) =
f(s_2)?
6. Let S = T be the set of all real numbers; define f: S → T by f(s) = s^3. This
is a function from S to T. What can you say about {f(s) | s ∈ S}? When is
f(s_1) = f(s_2)?
7. Let T be any nonempty set and let S = T × T, the Cartesian product of T
with itself. Define f: T × T → T by f(t_1, t_2) = t_1. This mapping from T × T
to T is called the projection of T × T onto its first component.
8. Let S be the set of all positive integers and let T be the set of all positive
rational numbers. Define f: S × S → T by f(m, n) = m/n. This defines a
mapping from S × S to T. Note that f(1, 2) = 1/2 while f(3, 6) = 3/6 = 1/2 =
f(1, 2), although (1, 2) ≠ (3, 6). Describe the subset of S × S consisting of
those (a, b) such that f(a, b) = 1/2.

The mappings to be defined in Examples 9 and 10 are mappings that 
occur for any nonempty sets and play a special role. 
9. Let S, T be nonempty sets, and let t_0 be a fixed element of T. Define
f: S → T by f(s) = t_0 for every s ∈ S; f is called a constant function from
S to T.


10. Let S be any nonempty set and define i: S → S by i(s) = s for every
s ∈ S. We call this function of S to itself the identity function (or identity mapping)
on S. We may, at times, denote it by i_S (and later in the book, by e).


Now that we have the notion of a mapping we need some way of identifying
when two mappings from one set to another are equal. This is not
God given; it is for us to decide how to declare f = g where f: S → T and
g: S → T. What is more natural than to define this equality via the actions of
f and g on the elements of S? More precisely, we declare that f = g if and
only if f(s) = g(s) for every s ∈ S. If S is the set of all real numbers and f is
defined on S by f(s) = s^2 + 2s + 1, while g is defined on S by g(s) =
(s + 1)^2, our definition of the equality of f and g is merely a statement of the
familiar identity (s + 1)^2 = s^2 + 2s + 1.
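
The pointwise definition of equality suggests a simple experiment: evaluate both rules on many inputs and compare. The sketch below does this for the two formulas just mentioned, on an illustrative sample of integers (an assumption made here to avoid rounding issues). Agreement on a finite sample is evidence only; equality on all real numbers is what the algebraic identity establishes.

```python
def f(s):
    """The rule f(s) = s^2 + 2s + 1."""
    return s**2 + 2*s + 1

def g(s):
    """The rule g(s) = (s + 1)^2."""
    return (s + 1)**2

# Compare f and g element by element on a finite sample of integers.
sample = range(-50, 51)
agree_on_sample = all(f(s) == g(s) for s in sample)
print(agree_on_sample)
```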




Having made the definition of equality of two mappings, we now want 
to single out certain types of mappings by the way they behave. 


Definition. The mapping f: S → T is onto or surjective if every t ∈ T
is the image under f of some s ∈ S; that is, if and only if, given t ∈ T, there
exists an s ∈ S such that t = f(s).


In the examples we gave earlier, in Example 1 the mapping is not onto, 
since not every woman that ever lived was the mother of a male child. Simi- 
larly, in Example 2 the mapping is not onto, for not every positive integer is 
the Social Security Number of some U.S. citizen. The mapping in Example 4 
fails to be onto because not every integer is even; and in Example 5, again, 
the mapping is not onto, for the number —1, for instance, is not the square of 
any real number. However, the mapping in Example 6 is onto because every
real number has a unique real cube root. The reader can decide whether or 
not the given mappings are onto in the other examples. 

If we define f(S) = {f(s) ∈ T | s ∈ S}, another way of saying that the
mapping f: S → T is onto is by saying that f(S) = T.

Another specific type of mapping plays an important and particular 
role in what follows. 


Definition. A mapping f: S → T is said to be one-to-one (written 1-1)
or injective if for s_1 ≠ s_2 in S, f(s_1) ≠ f(s_2) in T. Equivalently, f is 1-1 if
f(s_1) = f(s_2) implies that s_1 = s_2.


In other words, a mapping is 1-1 if it takes distinct objects into distinct 
images. In the examples of mappings we gave earlier, the mapping of Example 
1 is not 1-1, since two brothers would have the same mother. However in Ex- 
ample 2 the mapping is 1-1 because distinct U.S. citizens have distinct Social 
Security numbers (provided that there is no goof-up in Washington, which is 
unlikely). The reader should check if the various other examples of mappings 
are 1-1. 
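
For mappings between finite sets, both properties can be tested directly from the definitions. In the sketch below a mapping f: S → T is represented as a Python dictionary sending each s to f(s); the helper names `is_onto` and `is_one_to_one`, and the finite stand-ins for Examples 4 and 5, are illustrative assumptions.

```python
def is_onto(f, T):
    """f is a dict representing a mapping S -> T; onto means f(S) = T."""
    return set(f.values()) == set(T)

def is_one_to_one(f):
    """1-1 means distinct elements of S have distinct images."""
    images = list(f.values())
    return len(images) == len(set(images))

# Finite stand-in for Example 4: f(m) = 2m on S = {-3, ..., 3}.
S = range(-3, 4)
T = range(-6, 7)
double = {m: 2 * m for m in S}

# Finite stand-in for Example 5: f(s) = s^2 on the same S.
square = {s: s * s for s in S}

print(is_one_to_one(double), is_onto(double, T))
print(is_one_to_one(square), is_onto(square, {0, 1, 4, 9}))
```

The last call illustrates that whether a mapping is onto depends on the target set T: the squaring rule is onto its set of images f(S), though not onto a larger set.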

Given a mapping f: S → T and a subset A ⊂ T, we may want to look at
B = {s ∈ S | f(s) ∈ A}; we use the notation f^{-1}(A) for this set B, and call
f^{-1}(A) the inverse image of A under f. Of particular interest is f^{-1}(t), the inverse
image of the subset {t} of T consisting of the element t ∈ T alone. If
the inverse image of {t} consists of only one element, say s ∈ S, we could try
to define f^{-1}(t) by defining f^{-1}(t) = s. As we note below, this need not be a
mapping from T to S, but is so if f is 1-1 and onto. We shall use the same notation
f^{-1} in cases of both subsets and elements. This f^{-1} does not in general
define a mapping from T to S for several reasons. First, if f is not onto, then
there is some t in T which is not the image of any element s, so f^{-1}(t) = ∅.
Second, if f is not 1-1, then for some t ∈ T there are at least two distinct
s_1 ≠ s_2 in S such that f(s_1) = t = f(s_2). So f^{-1}(t) is not a unique element of
S, something we require in our definition of mapping. However, if f is both
1-1 and onto T, then f^{-1} indeed defines a mapping of T onto S. (Verify!) This
brings us to a very important class of mappings.


Definition. The mapping f: S → T is said to be a 1-1 correspondence
or bijection if f is both 1-1 and onto.
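
The inverse image of a subset, and the inverse of a bijection, can both be computed for finite mappings. The following is a minimal sketch, again representing a mapping as a dictionary; the helper names `inverse_image` and `inverse_mapping` and the sample mappings are hypothetical.

```python
def inverse_image(f, A):
    """Return {s in S | f(s) in A} for a mapping f given as a dict."""
    return {s for s, t in f.items() if t in A}

def inverse_mapping(f, T):
    """Return the inverse of f as a dict from T to S.

    Raises ValueError unless f is both 1-1 and onto T.
    """
    inv = {}
    for s, t in f.items():
        if t in inv:
            raise ValueError("f is not 1-1")
        inv[t] = s
    if set(inv) != set(T):
        raise ValueError("f is not onto T")
    return inv

# Squaring on {-3, ..., 3}: neither 1-1 nor onto {0, ..., 9}.
square = {s: s * s for s in range(-3, 4)}
print(inverse_image(square, {4}))   # two elements map to 4
print(inverse_image(square, {5}))   # nothing maps to 5

# A bijection from {0, ..., 4} onto {1, ..., 5}.
shift = {s: s + 1 for s in range(5)}
shift_inv = inverse_mapping(shift, range(1, 6))
```

The two failure cases in `inverse_mapping` correspond to the two reasons given in the text why the inverse image need not define a mapping from T to S.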


Now that we have the notion of a mapping and have singled out various 
types of mappings, we might very well ask: “Good and well, but what can we 
do with them?” As we shall see in a moment, we can introduce an operation 
of combining mappings in certain circumstances. 

Consider the situation g: S → T and f: T → U. Given an element s ∈ S,
then g sends it into the element g(s) in T; so g(s) is ripe for being acted on
by f. Thus we get an element f(g(s)) ∈ U. We claim that this procedure provides
us with a mapping from S to U. (Verify!) We define this more formally
in the


Definition. If g: S → T and f: T → U, then the composition (or product),
denoted by f ∘ g, is the mapping f ∘ g: S → U defined by (f ∘ g)(s) =
f(g(s)) for every s ∈ S.


Note that to compose the two mappings f and g, that is, for f ∘ g to
have any sense, the terminal set, T, for the mapping g must be the initial set
for the mapping f. One special time when we can always compose any two 
mappings is when S = T = U, that is, when we map S into itself. Although 
special, this case is of the utmost importance. 

We verify a few properties of this composition of mappings. 


Lemma 1.3.1. If $h: S \to T$, $g: T \to U$, and $f: U \to V$, then
$f \circ (g \circ h) = (f \circ g) \circ h$.

Proof. How shall we go about proving this lemma? To verify that two
mappings are equal, we merely must check that they do the same thing to
every element. Note first of all that both $f \circ (g \circ h)$ and $(f \circ g) \circ h$ define
mappings from S to V, so it makes sense to speak about their possible equality.

Our task, then, is to show that for every $s \in S$, $(f \circ (g \circ h))(s) =
((f \circ g) \circ h)(s)$. We apply the definition of composition to see that


$(f \circ (g \circ h))(s) = f((g \circ h)(s)) = f(g(h(s)))$.

Unraveling


$((f \circ g) \circ h)(s) = (f \circ g)(h(s)) = f(g(h(s)))$,


we do indeed see that


$(f \circ (g \circ h))(s) = ((f \circ g) \circ h)(s)$

for every $s \in S$. Consequently, by definition, $f \circ (g \circ h) = (f \circ g) \circ h$. □


(The symbol □ will always indicate that the proof has been completed.)

This equality is described by saying that mappings, under composition,
satisfy the associative law. Because of the equality involved there is really no
need for parentheses, so we write $f \circ (g \circ h)$ as $f \circ g \circ h$.


Lemma 1.3.2. If $g: S \to T$ and $f: T \to U$ are both 1-1, then $f \circ g: S \to U$
is also 1-1.


Proof. Let us suppose that $(f \circ g)(s_1) = (f \circ g)(s_2)$; thus, by definition,
$f(g(s_1)) = f(g(s_2))$. Since f is 1-1, we get from this that $g(s_1) = g(s_2)$;
however, g is also 1-1, thus $s_1 = s_2$ follows. Since $(f \circ g)(s_1) = (f \circ g)(s_2)$ forces
$s_1 = s_2$, the mapping $f \circ g$ is 1-1. □


We leave the proof of the next Remark to the reader. 


Remark. If $g: S \to T$ and $f: T \to U$ are both onto, then $f \circ g: S \to U$ is
also onto.


An immediate consequence of combining the Remark and Lemma 
1.3.2 is to obtain 


Lemma 1.3.3. If $g: S \to T$ and $f: T \to U$ are both bijections, then
$f \circ g: S \to U$ is also a bijection.


If f is a 1-1 correspondence of S onto T, then the “object” $f^{-1}: T \to S$
defined earlier can easily be shown to be a 1-1 mapping of T onto S. In this
case it is called the inverse of f. In this situation we have


Lemma 1.3.4. If $f: S \to T$ is a bijection, then $f \circ f^{-1} = i_T$ and
$f^{-1} \circ f = i_S$, where $i_S$ and $i_T$ are the identity mappings of S and T, respectively.


Proof. We verify one of these. If $t \in T$, then $(f \circ f^{-1})(t) = f(f^{-1}(t))$.
But what is $f^{-1}(t)$? By definition, $f^{-1}(t)$ is that element $s_0 \in S$ such that
$t = f(s_0)$. So $f(f^{-1}(t)) = f(s_0) = t$. In other words, $(f \circ f^{-1})(t) = t$ for every
$t \in T$; hence $f \circ f^{-1} = i_T$, the identity mapping on T. □
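Both identities of Lemma 1.3.4 can be checked on a small example. The sketch below assumes, as a convention of ours, that a finite mapping is a Python dictionary; the helper name `inverse` is hypothetical. It builds $f^{-1}$ by reversing each pair (s, f(s)) and refuses to do so when f is not a bijection onto T, which are exactly the two obstructions described earlier.

```python
def inverse(f, T):
    """Return the inverse of f as a dict from T to S; raise if f is not a bijection onto T."""
    inv = {}
    for s, t in f.items():
        if t in inv:
            # Two distinct elements share the image t, so f is not 1-1.
            raise ValueError(f"not 1-1: {inv[t]!r} and {s!r} both map to {t!r}")
        inv[t] = s
    if set(inv) != set(T):
        # Some element of T is not an image, so f is not onto.
        raise ValueError("not onto T")
    return inv


S = [1, 2, 3]
T = ["a", "b", "c"]
f = {1: "b", 2: "c", 3: "a"}
f_inv = inverse(f, T)

# f o f^{-1} is the identity on T, and f^{-1} o f is the identity on S.
assert all(f[f_inv[t]] == t for t in T)
assert all(f_inv[f[s]] == s for s in S)
```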


We leave the last result of this section for the reader to prove. 


Lemma 1.3.5. If $f: S \to T$ and $i_T$ is the identity mapping of T onto
itself and $i_S$ is that of S onto itself, then $i_T \circ f = f$ and $f \circ i_S = f$.


PROBLEMS 
Easier Problems 


1. For the given sets S, T determine if a mapping $f: S \to T$ is clearly and
unambiguously defined; if not, say why not.
(a) S = set of all women, T = set of all men, f(s) = husband of s.
(b) S = set of positive integers, T = S, f(s) = s - 1.
(c) S = set of positive integers, T = set of nonnegative integers, f(s) = s - 1.
(d) S = set of nonnegative integers, T = S, f(s) = s - 1.
(e) S = set of all integers, T = S, f(s) = s - 1.
(f) S = set of all real numbers, T = S, $f(s) = \sqrt{s}$.
(g) S = set of all positive real numbers, T = S, $f(s) = \sqrt{s}$.
2. In those parts of Problem 1 where f does define a function, determine if
it is 1-1, onto, or both.
*3. If f is a 1-1 mapping of S onto T, prove that $f^{-1}$ is a 1-1 mapping of T
onto S.
*4. If f is a 1-1 mapping of S onto T, prove that $f^{-1} \circ f = i_S$.
5. Give a proof of the Remark after Lemma 1.3.2.
*6. If $f: S \to T$ is onto and $g: T \to U$ and $h: T \to U$ are such that $g \circ f = h \circ f$,
prove that g = h.
*7. If $g: S \to T$, $h: S \to T$, and if $f: T \to U$ is 1-1, show that if $f \circ g = f \circ h$,
then g = h.
8. Let S be the set of all integers and T = {1, -1}; $f: S \to T$ is defined by
f(s) = 1 if s is even, f(s) = -1 if s is odd.
(a) Does this define a function from S to T?
(b) Show that $f(s_1 + s_2) = f(s_1) f(s_2)$. What does this say about the
integers?
(c) Is $f(s_1 s_2) = f(s_1) f(s_2)$ also true?
9. Let S be the set of all real numbers. Define $f: S \to S$ by $f(s) = s^2$, and
$g: S \to S$ by g(s) = s + 1.
(a) Find $f \circ g$.
(b) Find $g \circ f$.
(c) Is $f \circ g = g \circ f$?

10. Let S be the set of all real numbers and for $a, b \in S$, where $a \neq 0$, define
$f_{a,b}(s) = as + b$.
(a) Show that $f_{a,b} \circ f_{c,d} = f_{u,v}$ for some real u, v. Give explicit values for
u, v in terms of a, b, c, and d.
(b) Is $f_{a,b} \circ f_{c,d} = f_{c,d} \circ f_{a,b}$ always?
(c) Find all $f_{a,b}$ such that $f_{a,b} \circ f_{1,1} = f_{1,1} \circ f_{a,b}$.
(d) Show that $f_{a,b}^{-1}$ exists and find its form.

11. Let S be the set of all positive integers. Define $f: S \to S$ by f(1) = 2,
f(2) = 3, f(3) = 1, and f(s) = s for any other $s \in S$. Show that $f \circ f \circ f = i_S$.
What is $f^{-1}$ in this case?


Middle-Level Problems 


12. Let S be the set of nonnegative rational numbers, that is, S = {m/n | m, n
nonnegative integers, $n \neq 0$}, and let T be the set of all integers.

(a) Does $f: S \to T$ defined by $f(m/n) = 2^m 3^n$ define a legitimate function
from S to T?

(b) If not, how could you modify the definition of f so as to get a
legitimate function?

13. Let S be the set of all positive integers of the form $2^m 3^n$, where m > 0,
n > 0, and let T be the set of all rational numbers. Define $f: S \to T$ by
$f(2^m 3^n) = m/n$. Prove that f defines a function from S to T. (On what
properties of the integers does this depend?)

14. Let $f: S \to S$, where S is the set of all integers, be defined by f(s) =
as + b, where a, b are integers. Find the necessary and sufficient
conditions on a, b in order that $f \circ f = i_S$.

15. Find all f of the form given in Problem 14 such that $f \circ f \circ f = i_S$.

16. If f is a 1-1 mapping of S onto itself, show that $(f^{-1})^{-1} = f$.

17. If S is a finite set having m > 0 elements, how many mappings are there
of S into itself?

18. In Problem 17, how many 1-1 mappings are there of S into itself?

19. Let S be the set of all real numbers, and define $f: S \to S$ by $f(s) =
s^2 + as + b$, where a, b are fixed real numbers. Prove that for no values
of a, b can f be onto or 1-1.

20. Let S be the set of all positive real numbers. For positive reals a, c and
nonnegative reals b, d, is it ever possible that the mapping $f: S \to S$
defined by f(s) = (as + b)/(cs + d) satisfies $f \circ f = i_S$? Find all such a, b, c, d
that do the trick.

21. Let S be the set of all rational numbers and let $f_{a,b}: S \to S$ be defined by
$f_{a,b}(s) = as + b$, where $a \neq 0$, b are rational numbers. Find all $f_{c,d}$ of this
form satisfying $f_{c,d} \circ f_{a,b} = f_{a,b} \circ f_{c,d}$ for every $f_{a,b}$.

22. Let S be the set of all integers and a, b, c rational numbers. Define
$f: S \to S$ by $f(s) = as^2 + bs + c$. Find necessary and sufficient conditions
on a, b, c, so that f defines a mapping on S. [Note: a, b, c need not be
integers; for example, $f(s) = \frac{1}{2}s(s + 1) = \frac{1}{2}s^2 + \frac{1}{2}s$ does always give us an
integer for integral s.]


Harder Problems 


23. Let S be the set of all integers of the form $2^m 3^n$, $m \ge 0$, $n \ge 0$, and let T
be the set of all positive integers. Show that there is a 1-1 correspondence
of S onto T.

24. Prove that there is a 1-1 correspondence of the set of all positive integers
onto the set of all positive rational numbers.

25. Let S be the set of all real numbers and T the set of all positive reals.
Find a 1-1 mapping f of S onto T such that $f(s_1 + s_2) = f(s_1) f(s_2)$ for all
$s_1, s_2 \in S$.

26. For the f in Problem 25, find $f^{-1}$ explicitly.

27. If f, g are mappings of S into S and $f \circ g$ is a constant function, then

(a) What can you say about f if g is onto?

(b) What can you say about g if f is 1-1?

28. If S is a finite set and f is a mapping of S onto itself, show that f must be
1-1.

29. If S is a finite set and f is a 1-1 mapping of S into itself, show that f must
be surjective.

30. If S is a finite set and f is a 1-1 mapping of S, show that for some integer
n > 0,

$\underbrace{f \circ f \circ \cdots \circ f}_{n \text{ times}} = i_S$.

31. If S has m elements in Problem 30, find an n > 0 (in terms of m) that
works simultaneously for all 1-1 mappings of S into itself.


4. A(S) (THE SET OF 1-1 MAPPINGS OF S ONTO ITSELF)


We focus our attention in this section on particularly nice mappings of a non- 
empty set, S, into itself. Namely, we shall consider the set, A(S), of all 1-1 map- 
pings of S onto itself. Although most of the concern in the book will be in the 
case in which S is a finite set, we do not restrict ourselves to that situation here. 

When S has a finite number of elements, say n, then A(S) has a special
name. It is called the symmetric group of degree n and is often denoted by $S_n$.
Its elements are called permutations of S. If we are interested in the structure
of $S_n$, it really does not matter much what our underlying set S is. So, you
can think of S as being the set {1, ..., n}. Chapter 3 will be devoted to a
study, in some depth, of $S_n$. In the investigation of finite groups, $S_n$ plays a
central role.

There are many properties of the set A(S) on which we could concen- 
trate. We have chosen to develop those aspects here which will motivate the 
notion of a group and which will give the reader some experience, and feel- 
ing for, working in a group-theoretic framework. Groups will be discussed in 
Chapter 2. 

We begin with a result that is really a compendium of some of the re- 
sults obtained in Section 3. 


Lemma 1.4.1. A(S) satisfies the following:


(a) $f, g \in A(S)$ implies that $f \circ g \in A(S)$.

(b) $f, g, h \in A(S)$ implies that $(f \circ g) \circ h = f \circ (g \circ h)$.

(c) There exists an element, the identity mapping i, such that $f \circ i =
i \circ f = f$ for every $f \in A(S)$.

(d) Given $f \in A(S)$, there exists a $g \in A(S)$ ($g = f^{-1}$) such that $f \circ g =
g \circ f = i$.


Proof. All these things were done in Section 3, either in the text
material or in the problems. We leave it to the reader to find the relevant part of
Section 3 that will verify each of the statements (a) through (d). □


We should now like to know how many elements there are in A(S) 
when S is a finite set having n elements. To do so, we first make a slight di- 
gression. 

Suppose that you can do a certain thing in r different ways and a
second independent thing in s different ways. In how many distinct ways can
you do both things together? The best way of finding out is to picture this in
a concrete context. Suppose that there are r highways running from Chicago
to Detroit and s highways running from Detroit to Ann Arbor. In how many
ways can you go first to Detroit, then to Ann Arbor? Clearly, for every road
you take from Chicago to Detroit you have s ways of continuing on to Ann
Arbor. You can start your trip from Chicago in r distinct ways, hence you
can complete it in


$\underbrace{s + s + \cdots + s}_{r \text{ times}} = rs$


different ways.

It is fairly clear that we can extend this from doing two independent
things to doing m independent ones, for an integer m > 2. If we can do the
first thing in $r_1$ distinct ways, the second in $r_2$ ways, ..., the mth in $r_m$
distinct ways, then we can do all these together in $r_1 r_2 \cdots r_m$ different ways.
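The counting principle just stated can be confirmed by brute force for small numbers. The sketch below is only an illustration under assumptions of ours: each "thing" is modeled as a range of labelled choices, and the three sizes 3, 2, 4 are arbitrary. It lists every way of doing all the things together and compares the count with the product $r_1 r_2 r_3$.

```python
from itertools import product
from math import prod

# Three independent things with r_1 = 3, r_2 = 2, r_3 = 4 ways respectively.
choices = [range(3), range(2), range(4)]

# Every combined outcome is one choice for each thing.
outcomes = list(product(*choices))

# The number of combined outcomes is r_1 * r_2 * r_3 = 24.
assert len(outcomes) == prod(len(c) for c in choices) == 24
```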

Let’s recall something many of us have already seen: 


Definition. If n is a positive integer, then n! (read “n factorial”) is
defined by $n! = 1 \cdot 2 \cdot 3 \cdots n$.


Lemma 1.4.2. If S has n elements, then A(S) has n! elements. 


Proof. Let $f \in A(S)$, where $S = \{x_1, x_2, \ldots, x_n\}$. How many choices
does f have as a place to send $x_1$? Clearly n, for we can send $x_1$ under f to any
element of S. But now f is not free to send $x_2$ anywhere, for since f is 1-1, we
must have $f(x_1) \neq f(x_2)$. So we can send $x_2$ anywhere except onto $f(x_1)$.
Hence f can send $x_2$ into n - 1 different images. Continuing this way, we see
that f can send $x_i$ into n - (i - 1) different images. Hence the number of
such f's is $n(n - 1)(n - 2) \cdots 1 = n!$. □
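Lemma 1.4.2 is easy to test for small n. The following sketch assumes, as before, that an element of A(S) is represented by a dictionary; the function name `A` is chosen here only to echo the notation of the text. Each 1-1 mapping of S onto itself corresponds to one ordering of the images, which is what `itertools.permutations` enumerates.

```python
from itertools import permutations
from math import factorial


def A(S):
    """List all 1-1 mappings of the finite set S onto itself, each as a dict."""
    S = list(S)
    return [dict(zip(S, images)) for images in permutations(S)]


# A(S) has n! elements when S has n elements.
for n in range(1, 7):
    assert len(A(range(n))) == factorial(n)
```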


Example 


The number n! gets very large quickly. To be able to see the picture in its en- 
tirety, we look at the special case n = 3, where n! is still quite small. 

Consider $A(S) = S_3$, where S consists of the three elements $x_1, x_2, x_3$. We
list all the elements of $S_3$, writing out each mapping explicitly by what it does
to each of $x_1, x_2, x_3$.


1. $i: x_1 \to x_1, x_2 \to x_2, x_3 \to x_3$.

2. $f: x_1 \to x_2, x_2 \to x_3, x_3 \to x_1$.

3. $g: x_1 \to x_2, x_2 \to x_1, x_3 \to x_3$.

4. $g \circ f: x_1 \to x_1, x_2 \to x_3, x_3 \to x_2$. (Verify!)

5. $f \circ g: x_1 \to x_3, x_2 \to x_2, x_3 \to x_1$. (Verify!)

6. $f \circ f: x_1 \to x_3, x_2 \to x_1, x_3 \to x_2$. (Verify!)


Since we have listed here six different elements of $S_3$, and $S_3$ has only
six elements, we have a complete list of all the elements of $S_3$. What does
this list tell us? To begin with, we note that $f \circ g \neq g \circ f$, so one familiar rule
of the kind of arithmetic we have been used to is violated. Since $g \in S_3$ and
$g \in S_3$, we must have $g \circ g$ also in $S_3$. What is it? If we calculate $g \circ g$, we
easily get $g \circ g = i$. Similarly, we get


$(f \circ g) \circ (f \circ g) = i = (g \circ f) \circ (g \circ f)$.


Note also that $f \circ (f \circ f) = i$, hence $f^{-1} = f \circ f$. Finally, we leave it to the
reader to show that $g \circ f = f^{-1} \circ g$.
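The claims made about $S_3$ in this example can all be verified mechanically. In the sketch below the elements $x_1, x_2, x_3$ are represented by the integers 1, 2, 3 and each mapping by a dictionary; this representation and the helper name `compose` are assumptions of ours, not notation from the text.

```python
def compose(a, b):
    """Return a o b, the mapping x -> a(b(x))."""
    return {x: a[b[x]] for x in b}


i = {1: 1, 2: 2, 3: 3}
f = {1: 2, 2: 3, 3: 1}
g = {1: 2, 2: 1, 3: 3}

gf = compose(g, f)  # g o f
fg = compose(f, g)  # f o g
ff = compose(f, f)  # f o f

# The six mappings listed in the text are pairwise different.
elements = [i, f, g, gf, fg, ff]
assert len({tuple(sorted(m.items())) for m in elements}) == 6

assert fg != gf                                    # composition is not commutative
assert compose(g, g) == i                          # g o g = i
assert compose(fg, fg) == i == compose(gf, gf)     # both products square to i
assert compose(f, ff) == i                         # so the inverse of f is f o f
assert gf == compose(ff, g)                        # g o f = f^{-1} o g
```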


It is a little cumbersome to write this product in A(S) using the $\circ$. From
now on we shall drop it and write $f \circ g$ merely as fg. Also, we shall start using
the shorthand of exponents, to avoid expressions like $f \circ f \circ f \circ \cdots \circ f$. We
define, for $f \in A(S)$, $f^0 = i$, $f^2 = f \circ f = ff$, and so on. For negative exponents
-n we define $f^{-n}$ by $f^{-n} = (f^{-1})^n$, where n is a positive integer. The usual
rules of exponents prevail, namely $f^r f^s = f^{r+s}$ and $(f^r)^s = f^{rs}$. We leave
these as exercises (somewhat tedious ones at that) for the reader.


Example 

Do not jump to conclusions that all familiar properties of exponents go over.
For instance, in the example of the $f, g \in S_3$ defined above, we claim that
$(fg)^2 \neq f^2 g^2$. To see this, we note that


$fg: x_1 \to x_3, x_2 \to x_2, x_3 \to x_1$,


so that $(fg)^2: x_1 \to x_1, x_2 \to x_2, x_3 \to x_3$, that is, $(fg)^2 = i$. On the other
hand, $f^2 \neq i$ and $g^2 = i$, hence $f^2 g^2 = f^2 \neq i$, whence $(fg)^2 \neq f^2 g^2$ in this case.
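The failure of $(fg)^2 = f^2 g^2$ can be seen in a few lines of code. This is a sketch under conventions of ours: the elements $x_1, x_2, x_3$ are indexed 0, 1, 2, a permutation p is the tuple whose entry at position k is the image of k, and the names `compose` and `power` are hypothetical helpers.

```python
def compose(a, b):
    """Return the product ab, meaning first apply b, then a."""
    return tuple(a[b[k]] for k in range(len(b)))


def power(p, n):
    """Return p^n for a nonnegative integer n, with p^0 the identity."""
    result = tuple(range(len(p)))
    for _ in range(n):
        result = compose(result, p)
    return result


identity = (0, 1, 2)
f = (1, 2, 0)  # x1 -> x2, x2 -> x3, x3 -> x1
g = (1, 0, 2)  # x1 -> x2, x2 -> x1, x3 -> x3

fg = compose(f, g)
lhs = power(fg, 2)                       # (fg)^2
rhs = compose(power(f, 2), power(g, 2))  # f^2 g^2

assert lhs == identity
assert rhs == power(f, 2) != identity
assert lhs != rhs
```

When the two factors commute, as f and $f^2$ do, the same computation gives equal results, which is the content of Problem 5 below.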


However, some other familiar properties do go over. For instance, if
f, g, h are in A(S) and fg = fh, then g = h. Why? Because, from fg = fh we
have $f^{-1}(fg) = f^{-1}(fh)$; therefore, $g = ig = (f^{-1}f)g = f^{-1}(fg) = f^{-1}(fh) =
(f^{-1}f)h = ih = h$. Similarly, gf = hf implies that g = h. So we can cancel an
element in such an equation provided that we do not change sides. In $S_3$ our
f, g satisfy $gf = f^{-1}g$, but since $f \neq f^{-1}$ we cannot cancel the g here.


PROBLEMS 


Recall that fg stands for $f \circ g$ and, also, what $f^m$ means. S, without subscripts,
will be a nonempty set.

1. If $s_1 \neq s_2$ are in S, show that there is an $f \in A(S)$ such that $f(s_1) = s_2$.
2. If $s_1 \in S$, let $H = \{f \in A(S) \mid f(s_1) = s_1\}$. Show that:
(a) $i \in H$.
(b) If $f, g \in H$, then $fg \in H$.
(c) If $f \in H$, then $f^{-1} \in H$.
3. Suppose that $s_1 \neq s_2$ are in S and $f(s_1) = s_2$, where $f \in A(S)$. Then if H is
as in Problem 2 and $K = \{g \in A(S) \mid g(s_2) = s_2\}$, show that:
(a) If $g \in K$, then $f^{-1}gf \in H$.
(b) If $h \in H$, then there is some $g \in K$ such that $h = f^{-1}gf$.
4. If $f, g, h \in A(S)$, show that $(f^{-1}gf)(f^{-1}hf) = f^{-1}(gh)f$. What can you say
about $(f^{-1}gf)^n$?
5. If $f, g \in A(S)$ and fg = gf, show that:
(a) $(fg)^2 = f^2 g^2$.
(b) $(fg)^{-1} = f^{-1} g^{-1}$.
6. Push the result of Problem 5, for the same f and g, to show that $(fg)^m =
f^m g^m$ for all integers m.


*7. Verify the rules of exponents, namely $f^r f^s = f^{r+s}$ and $(f^r)^s = f^{rs}$ for
$f \in A(S)$ and positive integers r, s.
8. If $f, g \in A(S)$ and $(fg)^2 = f^2 g^2$, prove that fg = gf.
9. If $S = \{x_1, x_2, x_3, x_4\}$, let $f, g \in S_4$ be defined by


$f: x_1 \to x_2, x_2 \to x_3, x_3 \to x_4, x_4 \to x_1$,
and


$g: x_1 \to x_2, x_2 \to x_1, x_3 \to x_3, x_4 \to x_4$.


Calculate:
(a) $f^2$, $f^3$, $f^4$.
(b) $g^2$, $g^3$.
(c) fg.
(d) gf.
(e) $(fg)^3$, $(gf)^3$.
(f) $f^{-1}$, $g^{-1}$.
10. If $f \in S_3$, show that $f^6 = i$.
11. Can you find a positive integer m such that $f^m = i$ for all $f \in S_4$?
Middle-Level Problems


*12. If $f \in S_n$, show that there is some positive integer k, depending on f, such
that $f^k = i$. (Hint: Consider the positive powers of f.)

*13. Show that there is a positive integer t such that $f^t = i$ for all $f \in S_n$.

14. If m < n, show that there is a 1-1 mapping $F: S_m \to S_n$ such that F(fg) =
F(f)F(g) for all $f, g \in S_m$.

15. If S has three or more elements, show that we can find $f, g \in A(S)$ such
that $fg \neq gf$.

16. Let S be an infinite set and let $M \subset A(S)$ be the set of all elements
$f \in A(S)$ such that $f(s) \neq s$ for at most a finite number of $s \in S$. Prove
that:

(a) $f, g \in M$ implies that $fg \in M$.

(b) $f \in M$ implies that $f^{-1} \in M$.

17. For the situation in Problem 16, show, if $f \in A(S)$, that $f^{-1}Mf =
\{f^{-1}gf \mid g \in M\}$ must equal M.

18. Let $S \supset T$ and consider the subset $U(T) = \{f \in A(S) \mid f(t) \in T$ for every
$t \in T\}$. Show that:

(a) $i \in U(T)$.

(b) $f, g \in U(T)$ implies that $fg \in U(T)$.

19. If the S in Problem 18 has n elements and T has m elements, how
many elements are there in U(T)? Show that there is a mapping
$F: U(T) \to S_m$ such that F(fg) = F(f)F(g) for $f, g \in U(T)$ and F is onto
$S_m$.

20. If m < n, can F in Problem 19 ever be 1-1? If so, when?

21. In $S_n$ show that the mapping f defined by


$f: x_1 \to x_2, x_2 \to x_3, \ldots, x_{n-1} \to x_n, x_n \to x_1$


[i.e., $f(x_i) = x_{i+1}$ if i < n, $f(x_n) = x_1$] can be written as $f = g_1 g_2 \cdots g_{n-1}$
where each $g_i \in S_n$ interchanges exactly two elements of $S = \{x_1, \ldots, x_n\}$,
leaving the other elements fixed in S.


Harder Problems 


22. If $f \in S_n$, show that $f = h_1 h_2 \cdots h_m$ for some $h_j \in S_n$ such that $h_j^2 = i$.

23. Call an element in $S_n$ a transposition if it interchanges two elements,
leaving the others fixed. Show that any element in $S_n$ is a product of
transpositions. (This sharpens the result of Problem 22.)

24. If n is at least 3, show that for some f in $S_n$, f cannot be expressed in the
form $f = g^3$ for any g in $S_n$.

25. If $f \in S_n$ is such that $f \neq i$ but $f^3 = i$, show that we can number the
elements of S in such a way that $f(x_1) = x_2$, $f(x_2) = x_3$, $f(x_3) = x_1$,
$f(x_4) = x_5$, $f(x_5) = x_6$, $f(x_6) = x_4$, ..., $f(x_{3k+1}) = x_{3k+2}$, $f(x_{3k+2}) = x_{3k+3}$,
$f(x_{3k+3}) = x_{3k+1}$ for some k, and, for all the other $x_t \in S$, $f(x_t) = x_t$.

26. View a fixed shuffle of a deck of 52 cards as a 1-1 mapping of the deck
onto itself. Show that repeating this fixed shuffle a finite (positive)
number of times will bring the deck back to its original order.

*27. If $f \in A(S)$, call, for $s \in S$, the orbit of s (relative to f) the set O(s) =
$\{f^j(s) \mid$ all integers $j\}$. Show that if $s, t \in S$, then either $O(s) \cap O(t) = \emptyset$ or
O(s) = O(t).

28. If $S = \{x_1, x_2, \ldots, x_{12}\}$ and $f \in S_{12}$ is defined by $f(x_t) = x_{t+1}$ if t = 1, 2, ...,
11 and $f(x_{12}) = x_1$, find the orbits of all the elements of S (relative to f).

29. If $f \in A(S)$ satisfies $f^3 = i$, show that the orbit of any element of S has
one or three elements.

*30. Recall that a prime number is an integer p > 1 such that p cannot be
factored as a product of smaller positive integers. If $f \in A(S)$ satisfies $f^p = i$,
what can you say about the size of the orbits of the elements of S relative to
f? What property of the prime numbers are you using to get your answer?

31. Prove that if S has more than two elements, then the only elements $f_0$ in
A(S) such that $f_0 f = f f_0$ for all $f \in A(S)$ must satisfy $f_0 = i$.

*32. We say that $g \in A(S)$ commutes with $f \in A(S)$ if fg = gf. Find all
the elements in A(S) that commute with $f: S \to S$ defined by
$f(x_1) = x_2$, $f(x_2) = x_1$, and f(s) = s if $s \neq x_1, x_2$.

33. In $S_n$ show that the only elements commuting with f defined by $f(x_i) =
x_{i+1}$ if i < n, $f(x_n) = x_1$, are the powers of f, namely $i = f^0, f, f^2, \ldots, f^{n-1}$.

34. For $f \in A(S)$, let $C(f) = \{g \in A(S) \mid fg = gf\}$. Prove that:
(a) $g, h \in C(f)$ implies that $gh \in C(f)$.
(b) $g \in C(f)$ implies that $g^{-1} \in C(f)$.
(c) C(f) is not empty.


5. THE INTEGERS 


The mathematical set most familiar to everybody is that of the positive inte- 
gers 1,2,..., which we shall often call N. Equally familiar is the set, Z, of all 
integers—positive, negative, and zero. Because of this acquaintance with Z, 
we shall give here a rather sketchy survey of the properties of Z that we shall 
use often in the ensuing material. Most of these properties are well known to 
all of us; a few are less well known. 

The basic assumption we make about the set of integers is the


Well-Ordering Principle. Any nonempty set of nonnegative integers
has a smallest member.


More formally, what this principle states is that given a nonempty set V
of nonnegative integers, there is an element $v_0 \in V$ such that $v_0 \le v$ for every
$v \in V$. This principle will serve as the foundation for our ensuing discussion
of the integers.

The first application we make of it is to show something we all know 
and have taken for granted, namely that we can divide one integer by an- 
other to get a remainder that is smaller. This is known as Euclid’s Algorithm. 
We give it a more formal statement and a proof based on well-ordering. 


Theorem 1.5.1 (Euclid’s Algorithm). If m and n are integers with
n > 0, then there exist integers q and r, with $0 \le r < n$, such that m = qn + r.


Proof. Let W be the set of m - tn, where t runs through all the
integers, i.e., $W = \{m - tn \mid t \in Z\}$. Note that W contains some nonnegative
integers, for if t is large enough and negative, then m - tn > 0. Let
$V = \{v \in W \mid v \ge 0\}$; by the well-ordering principle V has a smallest element,
r. Since $r \in V$, $r \ge 0$, and r = m - qn for some q (for that is the form of all
elements in $W \supset V$). We claim that r < n. If not, $r = m - qn \ge n$, hence
$m - (q + 1)n \ge 0$. But this puts m - (q + 1)n in V, yet m - (q + 1)n < r,
contradicting the minimal nature of r in V. With this, Euclid’s Algorithm is
proved. □
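The proof suggests a procedure: walk through the elements m - tn of W until the least nonnegative one is reached. The sketch below follows that idea literally; the function name `divide` is ours, and the loop is chosen to mirror the argument, not for speed. For n > 0 the result agrees with Python's built-in `divmod`, which also returns a remainder in the range $0 \le r < n$.

```python
def divide(m, n):
    """Return (q, r) with m == q*n + r and 0 <= r < n, for integers m and n > 0."""
    if n <= 0:
        raise ValueError("n must be positive")
    q, r = 0, m          # r = m - q*n is always an element of W
    while r < 0:         # move into V, the nonnegative part of W
        q -= 1
        r += n
    while r >= n:        # r - n is a smaller element of V, so r is not minimal yet
        q += 1
        r -= n
    return q, r


assert divide(100, 28) == (3, 16)
assert divide(-7, 3) == (-3, 2)
assert all(divide(m, n) == divmod(m, n) for m in range(-20, 21) for n in range(1, 8))
```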


Euclid’s Algorithm will have a host of consequences for us, especially 
about the notion of divisibility. Since we are speaking about the integers, 
be it understood that all letters used in this section will be integers. This will 
save a lot of repetition of certain phrases. 


Definition. Given integers $m \neq 0$ and n we say that m divides n,
written as m | n, if n = cm for some integer c.


Thus, for instance, 2 | 14, (-7) | 14, 4 | (-16). If m | n, we call m a
divisor or factor of n, and n a multiple of m. To indicate that m is not a divisor of
n, we write $m \nmid n$; so, for instance, $3 \nmid 5$.

The basic elementary properties of divisibility are laid out in 


Lemma 1.5.2. The following are true:


(a) 1 | n for all n.
(b) If $m \neq 0$, then m | 0.
(c) If m | n and n | q, then m | q.
(d) If m | n and m | q, then m | (un + vq) for all u, v.
(e) If m | 1, then m = 1 or m = -1.
(f) If m | n and n | m, then $m = \pm n$.


Proof. The proofs of all these parts are easy, following immediately 
from the definition of m|n. We leave all but Part (d) as exercises but prove 
Part (d) here to give the flavor of how such proofs go. 

So suppose that m | n and m | q. Then n = cm and q = dm for some c
and d. Therefore, un + vq = u(cm) + v(dm) = (uc + vd)m. Thus, from the
definition, m | (un + vq). □


Having the concept of a divisor of an integer, we now want to introduce 
that of the greatest common divisor of two (or more) integers. Simply 
enough, this should be the largest possible integer that is a divisor of both in- 
tegers in question. However, we want to avoid using the size of an integer— 
for reasons that may become clear much later when we talk about rings. So 
we make the definition in what may seem as a strange way. 


Definition. Given a, b (not both 0), then their greatest common divi- 
sor c is defined by: 


(a) c > 0.
(b) c | a and c | b.
(c) If d | a and d | b, then d | c.


We write this c as c = (a, b).


In other words, the greatest common divisor of a and b is the positive
number c which divides a and b and is divisible by every d which divides a
and b.

Defining something does not guarantee its existence. So it is incumbent 
on us to prove that (a, b) exists, and is, in fact, unique. The proof actually 
shows more, namely that (a, b) is a nice combination of a and b. This combi- 
nation is not unique; for instance, 


$(24, 9) = 3 = 3 \cdot 9 + (-1) \cdot 24 = (-5) \cdot 9 + 2 \cdot 24$.


Theorem 1.5.3. If a, b are not both 0, then their greatest common
divisor c = (a, b) exists, is unique, and, moreover, $c = m_0 a + n_0 b$ for some
suitable $m_0$ and $n_0$.

Proof. Since not both a and b are 0, the set $A = \{ma + nb \mid m, n \in Z\}$
has nonzero elements. If $x \in A$ and x < 0, then -x is also in A and -x > 0,
for if $x = m_1 a + n_1 b$, then $-x = (-m_1)a + (-n_1)b$, so is in A. Thus A has
positive elements; hence, by the well-ordering principle there is a smallest
positive element, c, in A. Since $c \in A$, by the form of the elements of A we
know that $c = m_0 a + n_0 b$ for some $m_0$, $n_0$.

We claim that c is our required greatest common divisor. First note that
if d | a and d | b, then $d \mid (m_0 a + n_0 b)$ by Part (d) of Lemma 1.5.2, that is,
d | c. So, to verify that c is our desired element, we need only show that c | a
and c | b.

By Euclid’s Algorithm, a = qc + r, where $0 \le r < c$, that is, $a =
q(m_0 a + n_0 b) + r$. Therefore, $r = -q n_0 b + (1 - q m_0)a$. So r is in A. But
r < c and is in A, so by the choice of c, r cannot be positive. Hence r = 0; in
other words, a = qc and so c | a. Similarly, c | b.

For the uniqueness of c, if t > 0 also satisfied t | a, t | b and d | t for all d
such that d | a and d | b, we would have t | c and c | t. By Part (f) of Lemma
1.5.2 we get that t = c (since both are positive). □
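The heart of this proof is that (a, b) is the smallest positive element of the set A of all combinations ma + nb. For small numbers this can be checked by searching a finite window of coefficients. The sketch below is an illustration only: the name `smallest_positive_combination` and the window size `bound` are assumptions of ours, and the search is reliable only when the window is large enough to contain a pair of coefficients realizing the greatest common divisor.

```python
from math import gcd


def smallest_positive_combination(a, b, bound=20):
    """Return the least positive value of m*a + n*b with |m|, |n| <= bound."""
    values = {
        m * a + n * b
        for m in range(-bound, bound + 1)
        for n in range(-bound, bound + 1)
    }
    return min(v for v in values if v > 0)


# The least positive combination agrees with the greatest common divisor.
assert smallest_positive_combination(24, 9) == gcd(24, 9) == 3
assert smallest_positive_combination(100, 28) == gcd(100, 28) == 4
assert smallest_positive_combination(-24, 9) == 3
```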


Let’s look at an explicit example, namely a = 24, b = 9. By direct
examination we know that (24, 9) = 3; note that $3 = 3 \cdot 9 + (-1) \cdot 24$. What is
(-24, 9)?

How is this done for positive numbers a and b which may be quite large?
If b > a, interchange a and b so that a > b > 0. Then we can find (a, b) by


1. observing that (a, b) = (b, r) where a = qb + r with $0 \le r < b$ (Why?);
2. finding (b, r), which now is easier since one of the numbers is smaller
than before.


So, for example, we have


(100, 28) = (28, 16) since 100 = 3(28) + 16
(28, 16) = (16, 12) since 28 = 1(16) + 12
(16, 12) = (12, 4) since 16 = 1(12) + 4


This gives us
(100, 28) = (12, 4) = 4.
It is possible to find the actual values of $m_0$ and $n_0$ such that


$4 = m_0 \cdot 100 + n_0 \cdot 28$

by going backwards through the calculations made to find 4:


Since 16 = 1(12) + 4, 4 = 16 + (-1)12

Since 28 = 1(16) + 12, 12 = 28 + (-1)16

Since 100 = 3(28) + 16, 16 = 100 + (-3)28
But then


4 = 16 + (-1)12 = 16 + (-1)(28 + (-1)16)
= (-1)28 + (2)16 = (-1)28 + (2)(100 + (-3)28)
= (2)100 + (-7)28
so that $m_0 = 2$ and $n_0 = -7$.
This shows how Euclid’s Algorithm can be used to compute (a, b) for
any positive integers a and b.
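The back-substitution carried out by hand above can be folded into the computation itself. The following is a sketch, with the hypothetical name `extended_gcd`, of one standard way to organize it: reduce (a, b) to (b, r) exactly as in step 1, solve the smaller problem, and then substitute r = a - qb. It assumes a and b are nonnegative and not both 0.

```python
def extended_gcd(a, b):
    """Return (c, m0, n0) with c = (a, b) and c == m0*a + n0*b, for a, b >= 0 not both 0."""
    if b == 0:
        return a, 1, 0              # (a, 0) = a = 1*a + 0*0
    q, r = divmod(a, b)             # a = q*b + r with 0 <= r < b
    c, m, n = extended_gcd(b, r)    # c = m*b + n*r
    return c, n, m - q * n          # substitute r = a - q*b


c, m0, n0 = extended_gcd(100, 28)
assert (c, m0, n0) == (4, 2, -7)    # the values found by hand in the text
assert m0 * 100 + n0 * 28 == 4

assert extended_gcd(24, 9) == (3, -1, 3)   # 3 = (-1)*24 + 3*9
```

As the text points out, the coefficients are not unique; this procedure simply returns one valid pair.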
We shall include some exercises at the end of this section on other
properties of (a, b).

We come to the very important


Definition. We say that a and b are relatively prime if (a, b) = 1. 


So the integers a and b are relatively prime if they have no nontrivial 
common factor. An immediate corollary to Theorem 1.5.3 is 


Theorem 1.5.4. The integers a and b are relatively prime if and only if
1 = ma + nb for suitable integers m and n.


Theorem 1.5.4 has an immediate consequence 


Theorem 1.5.5. If a and b are relatively prime and a | bc, then a| c. 


Proof. By Theorem 1.5.4, ma + nb = 1 for some m and n, hence
(ma + nb)c = c, that is, mac + nbc = c. By assumption, a | bc and by
observation a | mac, hence a | (mac + nbc) and so a | c. □


Corollary. If b and c are both relatively prime to a, then bc is also
relatively prime to a.


Proof. We pick up the proof of Theorem 1.5.5 at mac + nbc = c. If d =
(a, bc), then d | a and d | bc, hence d | (mac + nbc) = c. Since d | a and d | c
and (a, c) = 1, we get that d = 1. Since 1 = d = (a, bc), we have that bc is
relatively prime to a. □


We now single out an ultra-important class of positive integers, which 
we met before in Problem 30, Section 4. 


Definition. A prime number, or a prime, is an integer p > 1, such that 
for any integer a either p | a or p is relatively prime to a. 


This definition coincides with the usual one, namely that we cannot
factor p nontrivially. For if p is a prime as defined above and p = ab where
$1 \le a < p$, then (a, p) = a (Why?) and p does not divide a since p > a. It
follows that a = 1, so p = b. On the other hand, if p is a prime in the sense that
it cannot be factored nontrivially, and if a is an integer not relatively prime to
p, then (a, p) is not 1 and it divides a and p. But then (a, p) equals p, by our
hypothesis, so p divides a.

Another result coming out of Theorem 1.5.5 is 


Theorem 1.5.6. If p is a prime and $p \mid (a_1 a_2 \cdots a_n)$, then $p \mid a_i$ for
some i with $1 \le i \le n$.


Proof. If $p \mid a_1$, there is nothing to prove. Suppose that $p \nmid a_1$; then p
and $a_1$ are relatively prime. But $p \mid a_1(a_2 \cdots a_n)$, hence by Theorem 1.5.5,
$p \mid a_2 \cdots a_n$. Repeat the argument just given on $a_2$, and continue. □


The primes play a very special role in the set of integers larger than 1 in 
that every integer n > 1 is either a prime or is the product of primes. We 
shall show this in the next theorem. In the theorem after the next we shall 
show that there is a uniqueness about the way n > 1 factors into prime fac- 
tors. The proofs of both these results lean heavily on the well-ordering prin- 
ciple. 


Theorem 1.5.7. If n > 1, then either n is a prime or n is the product of
primes.


Proof. Suppose that the theorem is false. Then there must be an
integer m > 1 for which the theorem fails. Therefore, the set M for which the
theorem fails is nonempty, so, by the well-ordering principle, M has a least
element m. Clearly, since $m \in M$, m cannot be a prime, thus m = ab, where
1 < a < m and 1 < b < m. Because a < m and b < m and m is the least
element in M, we cannot have $a \in M$ or $b \in M$. Since $a \notin M$, $b \notin M$, by the
definition of M the theorem must be true for both a and b. Thus a and b are
primes or the product of primes; from m = ab we get that m is a product
of primes. This puts m outside of M, contradicting that $m \in M$. This proves
the theorem. □


We asserted above that there is a certain uniqueness about the
decomposition of an integer into primes. We make this precise now. To avoid
trivialities of the kind $6 = 2 \cdot 3 = 3 \cdot 2$ (so, in a sense, 6 has two factorizations into
the primes 2 and 3), we shall state the theorem in a particular way.


Theorem 1.5.8. Given n > 1, then there is one and only one way to
write n in the form $n = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}$, where $p_1 < p_2 < \cdots < p_k$ are primes
and the exponents $a_1, a_2, \ldots, a_k$ are all positive.


Proof. We start as we did above by assuming that the theorem is
false, so there is a least integer m > 1 for which it is false. This m must have
two distinct factorizations as $m = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k} = q_1^{b_1} q_2^{b_2} \cdots q_\ell^{b_\ell}$ where
$p_1 < p_2 < \cdots < p_k$, $q_1 < q_2 < \cdots < q_\ell$ are primes and where the exponents
$a_1, \ldots, a_k$ and $b_1, \ldots, b_\ell$ are all positive. Since $p_1 \mid p_1^{a_1} \cdots p_k^{a_k} = q_1^{b_1} \cdots q_\ell^{b_\ell}$,
by Theorem 1.5.6 $p_1 \mid q_i^{b_i}$ for some i; hence, again by Theorem 1.5.6, $p_1 \mid q_i$,
hence $p_1 = q_i$. By the same token $q_1 = p_j$ for some j; thus $p_1 \le p_j =
q_1 \le q_i = p_1$. This gives us that $p_1 = q_1$. Now since $m/p_1 < m$, $m/p_1$
has the unique factorization property. But $m/p_1 = p_1^{a_1 - 1} p_2^{a_2} \cdots p_k^{a_k} =
p_1^{b_1 - 1} q_2^{b_2} \cdots q_\ell^{b_\ell}$ and since $m/p_1$ can be factored in one and only one way in
this form, we easily get $k = \ell$, $p_2 = q_2, \ldots, p_k = q_k$, $a_1 - 1 = b_1 - 1$,
$a_2 = b_2, \ldots, a_k = b_k$. So we see that the primes and their exponents arising
in the factorization of m are unique. This contradicts the lack of such
uniqueness for m, and so proves the theorem. □
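The normal form of Theorem 1.5.8 can be computed for small n by trial division. The sketch below is one simple way to do so and is not drawn from the text; the function name `factor` is ours. It returns the pairs $(p_i, a_i)$ with the primes in increasing order, so by the theorem the output for a given n is the only possible one.

```python
def factor(n):
    """Return [(p_1, a_1), ..., (p_k, a_k)] with n = p_1^a_1 ... p_k^a_k and p_1 < ... < p_k."""
    if n <= 1:
        raise ValueError("n must exceed 1")
    result = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            a = 0
            while n % p == 0:   # divide out every copy of p
                n //= p
                a += 1
            result.append((p, a))
        p += 1
    if n > 1:                   # what remains has no divisor up to its square root, so it is prime
        result.append((n, 1))
    return result


assert factor(36) == [(2, 2), (3, 2)]
assert factor(120) == [(2, 3), (3, 1), (5, 1)]
assert factor(5040) == [(2, 4), (3, 2), (5, 1), (7, 1)]
```

Only primes are ever recorded, because by the time a composite trial divisor is reached all of its prime factors have already been divided out.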


What these last two theorems tell us is that we can build up the integers 
from the primes in a very precise and well-defined manner. One would ex- 
pect from this that there should be many—that is, an infinity—of primes. 
This old result goes back to Euclid; in fact, the argument we shall give is due 
to Euclid. 


Theorem 1.5.9. There is an infinite number of primes. 


Proof. If the result were false, we could enumerate all the primes in
$p_1, p_2, \ldots, p_k$. Consider the integer $q = 1 + p_1 p_2 \cdots p_k$. Since
$q > p_i$ for every $i = 1, 2, \ldots, k$, q cannot be a prime. Since
$p_i \nmid q$, for we get a remainder of 1 on dividing q by $p_i$, q is not
divisible by any of $p_1, \ldots, p_k$. So q is not a prime nor is it divisible
by any prime. This violates Theorem 1.5.7, thereby proving the theorem. □

Results much sharper than Theorem 1.5.9 exist about how many primes
there are up to a given point. The famous prime number theorem states that
for large n the number of primes less than or equal to n is “more or less”
$n/\log_e n$, where this “more or less” is precisely described. There are many
open questions about the prime numbers.
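
To get a feeling for the estimate $n/\log_e n$, one can count primes directly and compare. The sketch below uses a sieve of Eratosthenes, a standard technique that is not discussed in the text; the function name `primes_up_to` is an assumption of this example.

```python
import math


def primes_up_to(n):
    """Sieve of Eratosthenes: return the list of all primes p <= n."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, math.isqrt(n) + 1):
        if is_prime[p]:
            # Every multiple of p from p*p upward is composite.
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
    return [k for k in range(n + 1) if is_prime[k]]


for n in (100, 1000, 10000):
    count = len(primes_up_to(n))
    estimate = n / math.log(n)
    print(n, count, round(estimate, 1))
```

The printed counts and estimates differ noticeably for small n; the prime number theorem only asserts that their ratio tends to 1 as n grows.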


PROBLEMS 


Easier Problems 


1. Find (a, b) and express (a, b) as ma + nb for: 


(a) (116, -84).
(b) (85, 65). 
(c) (72, 26). 
(d) (72, 25). 


2. Prove all the parts of Lemma 1.5.2, except part (d). 

3. Show that (ma, mb) = m(a, b) if m > 0. 

4. Show that if a | m and b | m and (a, b) = 1, then (ab) | m.

5. Factor the following into primes. 

(a) 36. 

(b) 120. 
(c) 720. 
(d) 5040. 

6. If $m = p_1^{a_1} \cdots p_k^{a_k}$ and $n = p_1^{b_1} \cdots p_k^{b_k}$, where
$p_1, \ldots, p_k$ are distinct primes and $a_1, \ldots, a_k$ are nonnegative and
$b_1, \ldots, b_k$ are nonnegative, express (m, n) as $p_1^{c_1} \cdots p_k^{c_k}$
by describing the c’s in terms of the a’s and b’s.

*7. Define the least common multiple of positive integers m and n to be the
smallest positive integer v such that both m | v and n | v.

(a) Show that v = mn/(m, n).

(b) In terms of the factorization of m and n given in Problem 6, what is v?

8. Find the least common multiple of the pairs given in Problem 1.

9. If m, n > 0 are two integers, show that we can find integers u, v with
$-n/2 \le v \le n/2$ such that m = un + v.

10. To check that a given integer n > 1 is a prime, prove that it is enough to
show that n is not divisible by any prime p with $p \le \sqrt{n}$.

11. Check if the following are prime.
(a) 301. 
(b) 1001. 
(c) 473. 

12. Starting with 2, 3, 5, 7, ..., construct the positive integers $1 + 2 \cdot 3$,
$1 + 2 \cdot 3 \cdot 5$, $1 + 2 \cdot 3 \cdot 5 \cdot 7$, .... Do you always get a
prime number this way?


Middle-Level Problems 


13. If p is an odd prime, show that p is of the form: 
(a) 4n + 1 or 4n + 3 for some n. 
(b) 6n + 1 or 6n + 5 for some n. 

14. Adapt the proof of Theorem 1.5.9 to prove: 
(a) There is an infinite number of primes of the form 4n + 3. 
(b) There is an infinite number of primes of the form 6n + 5. 

15. Show that no integer u = 4n + 3 can be written as $u = a^2 + b^2$, where
a, b are integers.

16. If T is an infinite subset of N, the set of all positive integers, show that 
there is a 1-1 mapping of T onto N. 

17. If p is a prime, prove that one cannot find nonzero integers a and b such
that $a^2 = p b^2$. (This shows that $\sqrt{p}$ is irrational.)


6. MATHEMATICAL INDUCTION 


If we look back at Section 5, we see that at several places—for instance, in 
the proof of Theorem 1.5.6—we say “argue as above and continue.” This is 
not very satisfactory as a means of nailing down an argument. What is clear 
is that we need some technique of avoiding such phrases when we want to 
prove a proposition about all the positive integers. This is provided for us by 
the Principle of Mathematical Induction; in fact, this will be the usual method 
that we shall use for proving theorems about all the positive integers. 


Theorem 1.6.1. Let P(n) be a statement about the positive integers
such that:
(a) P(1) is true.
(b) If P(k) happens to be true for some integer $k \ge 1$, then P(k + 1) is also
true.

Then P(n) is true for all $n \ge 1$.

Proof. Actually, the arguments given in proving Theorems 1.5.7 and
1.5.8 are a prototype of the argument we give here.

Suppose that the theorem is false; then, by well-ordering, there is a
least integer $m \ge 1$ for which P(m) is not true. Since P(1) is true, $m \ne 1$,
hence m > 1. Now $1 \le m - 1 < m$, so by the choice of m, P(m - 1) must be
valid. But then by the inductive hypothesis [Part (b)] we must have that P(m)
is true. This contradicts that P(m) is not true. Thus there can be no integer
for which P is not true, and so the theorem is proved. □


We illustrate how to use induction with some rather diverse examples. 


Examples 


1. Suppose that n tennis balls are put in a straight line, touching each other.
Then we claim that these balls make n - 1 contacts.


Proof. If n = 2, the matter is clear. If for k balls we have k - 1 contacts,
then adding one ball (on a line) adds one contact. So k + 1 balls would
have k contacts. So if P(n) is what is stated above about the tennis balls, we
see that if P(k) happens to be true, then so is P(k + 1). Thus, by the theorem,
P(n) is true for all $n \ge 1$. □


2. If p is a prime and $p \mid a_1 a_2 \cdots a_n$, then $p \mid a_i$ for some
$1 \le i \le n$.


Proof. Let P(n) be the statement in Example 2. Then P(1) is true, for
if $p \mid a_1$, it certainly divides $a_i$ for some $1 \le i \le 1$.

Suppose we know that P(k) is true, and that $p \mid a_1 a_2 \cdots a_k a_{k+1}$.
Thus, by Theorem 1.5.6, since $p \mid (a_1 a_2 \cdots a_k) a_{k+1}$, either
$p \mid a_{k+1}$ (a desired conclusion) or $p \mid a_1 \cdots a_k$. In this second
possibility, since P(k) is true we have that $p \mid a_i$ for some
$1 \le i \le k$. Combining both possibilities, we get that $p \mid a_j$ for some
$1 \le j \le k + 1$. So Part (b) of Theorem 1.6.1 holds; hence P(n) is
true for all $n \ge 1$. □


3. For $n \ge 1$, $1 + 2 + \cdots + n = \tfrac{1}{2} n(n + 1)$.


Proof. If P(n) is the proposition that $1 + 2 + \cdots + n = \tfrac{1}{2} n(n + 1)$,
then P(1) is certainly true, for $1 = \tfrac{1}{2}(1 + 1)$. If P(k) should be true,
this means that


$1 + 2 + \cdots + k = \tfrac{1}{2} k(k + 1)$.


The question is: Is P(k + 1) then also true, that is, is
$1 + 2 + \cdots + k + (k + 1) = \tfrac{1}{2}(k + 1)((k + 1) + 1)$? Now
$1 + 2 + \cdots + k + (k + 1) = (1 + 2 + \cdots + k) + (k + 1) = \tfrac{1}{2} k(k + 1) + (k + 1)$,
since P(k) is valid. But
$\tfrac{1}{2} k(k + 1) + (k + 1) = \tfrac{1}{2}(k(k + 1) + 2(k + 1)) = \tfrac{1}{2}(k + 1)(k + 2)$,
which assures us that P(k + 1) is true. Thus the proposition
$1 + 2 + \cdots + n = \tfrac{1}{2} n(n + 1)$ is true for all $n \ge 1$. □

We must emphasize one point here: Mathematical induction is not a
method for finding results about integers; it is a means of verifying a result.
We could, by other means, find the formula given above for $1 + 2 + \cdots + n$.
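
The two ingredients of the induction in Example 3, the base case P(1) and the algebra of the step from P(k) to P(k + 1), can be spot-checked numerically. The Python sketch below does this for a finite range of k; the function names are assumptions of this example, and a finite check of this kind is only a sanity test, not a replacement for the proof.

```python
def closed_form(n):
    """The right-hand side n(n + 1)/2, computed in exact integer arithmetic."""
    return n * (n + 1) // 2


def check_base_and_step(limit):
    """Spot-check P(1) and the step P(k) -> P(k + 1) for 1 <= k < limit."""
    if closed_form(1) != 1:
        return False
    for k in range(1, limit):
        # Induction step: (1 + ... + k) + (k + 1) must equal the formula at k + 1.
        if closed_form(k) + (k + 1) != closed_form(k + 1):
            return False
    return True


print(check_base_and_step(1000))               # True
print(sum(range(1, 101)) == closed_form(100))  # True
```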

Part (b) of Theorem 1.6.1 is usually called the induction step. 

In the problems we shall give some other versions of the principle of in- 
duction. 


PROBLEMS 
Easier Problems 


1. Prove that $1^2 + 2^2 + 3^2 + \cdots + n^2 = \tfrac{1}{6} n(n + 1)(2n + 1)$ by
induction.

2. Prove that $1^3 + 2^3 + \cdots + n^3 = \tfrac{1}{4} n^2 (n + 1)^2$ by induction.

3. Prove that a set having $n \ge 2$ elements has $\tfrac{1}{2} n(n - 1)$ subsets
having exactly two elements.

4. Prove that a set having $n \ge 3$ elements has $n(n - 1)(n - 2)/3!$ subsets
having exactly three elements.

5. If $n \ge 4$ and S is a set having n elements, guess (from Problems 3 and 4)
how many subsets having exactly 4 elements there are in S. Then verify
your guess using mathematical induction.

*6. Complete the proof of Theorem 1.5.6, replacing the last sentence by an
induction argument.

7. If $a \ne 1$, prove that $1 + a + a^2 + \cdots + a^n = (a^{n+1} - 1)/(a - 1)$ by
induction.

8. By induction, show that


$\frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \cdots + \frac{1}{n(n + 1)} = \frac{n}{n + 1}$.


*9. Suppose that P(n) is a proposition about positive integers n such that
$P(n_0)$ is valid, and if P(k) is true, so must P(k + 1) be. What can you say
about P(n)? Prove your statement.

*10. Let P(n) be a proposition about integers n such that P(1) is true and
such that if P(j) is true for all positive integers j < k, then P(k) is true.
Prove that P(n) is true for all positive integers n.


Middle-Level Problems 


11. Give an example of a proposition that is not true for any positive integer, 
yet for which the induction step [Part (b) of Theorem 1.6.1] holds. 


12. Prove by induction that a set having n elements has exactly $2^n$ subsets.

13. Prove by induction on n that $n^3 - n$ is always divisible by 3.

14. Using induction on n, generalize the result in Problem 13 to: If p is a prime
number, then $n^p - n$ is always divisible by p. (Hint: The binomial theorem.)

15. Prove by induction that for a set having n elements the number of 1-1 
mappings of this set onto itself is n!. 


7. COMPLEX NUMBERS 


We all know something about the integers, rational numbers, and real num- 
bers—indeed, this assumption has been made for some of the text material 
and many of the problems have referred to these numbers. Unfortunately, 
the complex numbers and their properties are much less known to present- 
day college students. At one time the complex numbers were a part of the 
high school curriculum and the early college one. This is no longer the case. 
So we shall do a rapid development of this very important mathematical set. 

The set of complex numbers, C, is the set of all a + bi, where a, b are 
real and where we declare: 


1. a + bi = c + di, for a, b, c, d real, if and only if a = c and b = d.
2. (a + bi) + (c + di) = (a + c) + (b + d)i.
3. (a + bi)(c + di) = (ac - bd) + (ad + bc)i.


This last property, multiplication, can best be remembered by using
$i^2 = -1$ and multiplying out formally with this relation in mind.
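
The sum and product rules for C can be written out directly on pairs (a, b) standing for a + bi. The following Python sketch is one such transcription; representing a complex number as a tuple and the names `add` and `mul` are assumptions of this example rather than notation from the text.

```python
def add(z, w):
    """(a + bi) + (c + di) = (a + c) + (b + d)i, with z = (a, b) and w = (c, d)."""
    (a, b), (c, d) = z, w
    return (a + c, b + d)


def mul(z, w):
    """(a + bi)(c + di) = (ac - bd) + (ad + bc)i."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)


i = (0, 1)
print(mul(i, i))            # (-1, 0), that is, i^2 = -1
print(mul((1, 2), (3, 4)))  # (-5, 10)
print(add((1, 2), (3, 4)))  # (4, 6)
```

Multiplying out (1 + 2i)(3 + 4i) formally and replacing $i^2$ by -1 gives 3 + 10i - 8 = -5 + 10i, in agreement with the second line of output.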

For the complex number z = a + bi, a is called the real part of z and b
the imaginary part of z. If a is 0, we call z purely imaginary.

We shall write 0 + 0i as 0 and a + 0i as a. Note that z + 0 = z, z1 = z
for any complex number z.

Given z = a + bi, there is a complex number related to z, which we
write as $\bar{z}$, defined by $\bar{z} = a - bi$. This complex number, $\bar{z}$,
is called the complex conjugate of z. Taking the complex conjugate gives us a
mapping of C onto itself. We claim


Lemma 1.7.1. If $z, w \in C$, then:


(a) $\overline{(\bar{z})} = z$.

(b) $\overline{(z + w)} = \bar{z} + \bar{w}$.

(c) $\overline{(zw)} = \bar{z}\,\bar{w}$.

(d) $z\bar{z}$ is real and nonnegative and is, in fact, positive if $z \ne 0$.

(e) $z + \bar{z}$ is twice the real part of z.

(f) $z - \bar{z}$ is twice the imaginary part of z times i.


Proof. Most of the parts of this lemma are straightforward and merely
involve using the definition of complex conjugate. We do verify Parts (c) and (d).

Suppose that z = a + bi, w = c + di, where a, b, c, d are real. So
zw = (ac - bd) + (ad + bc)i, hence


$\overline{(zw)} = \overline{(ac - bd) + (ad + bc)i} = (ac - bd) - (ad + bc)i$.


On the other hand, $\bar{z} = a - bi$ and $\bar{w} = c - di$, hence, by the
definition of the product in C, $\bar{z}\,\bar{w} = (ac - bd) - (ad + bc)i$.
Comparing this with the result that we obtained for $\overline{(zw)}$, we see
that indeed $\overline{(zw)} = \bar{z}\,\bar{w}$. This verifies Part (c).

We go next to the proof of Part (d). Suppose that $z = a + bi \ne 0$; then
$\bar{z} = a - bi$ and $z\bar{z} = a^2 + b^2$. Since a, b are real and not both 0,
$a^2 + b^2$ is real and positive, as asserted in Part (d). □


The proof of Part (d) of Lemma 1.7.1 shows that if $z = a + bi \ne 0$, then
$z\bar{z} = a^2 + b^2 \ne 0$ and $z(\bar{z}/(a^2 + b^2)) = 1$, so


$\frac{\bar{z}}{a^2 + b^2} = \frac{a}{a^2 + b^2} - \frac{b}{a^2 + b^2}\, i$


acts like the inverse 1/z of z. This allows us to carry out division in C, staying
in C while doing so.
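
The inverse formula can be tried out with exact rational arithmetic. In the Python sketch below a complex number a + bi with integer a, b is stored as a pair (a, b); this representation and the function names are assumptions of the example, and `fractions.Fraction` from the standard library keeps the quotients exact.

```python
from fractions import Fraction


def mul(z, w):
    """(a + bi)(c + di) = (ac - bd) + (ad + bc)i."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)


def inverse(z):
    """Return conj(z)/(a^2 + b^2) for z = a + bi != 0 with integer a, b."""
    a, b = z
    norm = a * a + b * b  # z times its conjugate: real, positive when z != 0
    if norm == 0:
        raise ZeroDivisionError("0 has no inverse in C")
    return (Fraction(a, norm), Fraction(-b, norm))


z = (3, -4)
print(inverse(z))          # (Fraction(3, 25), Fraction(4, 25))
print(mul(z, inverse(z)))  # (Fraction(1, 1), Fraction(0, 1))
```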
We now list a few properties of C. 


Lemma 1.7.2. C behaves under its sum and product according to the
following: If $u, v, w \in C$, then


(a) u + v = v + u.

(b) (u + v) + w = u + (v + w).

(c) uv = vu.

(d) (uv)w = u(vw).

(e) $u \ne 0$ implies that $u^{-1} = 1/u$ exists in C such that $uu^{-1} = 1$.


Proof. We leave the proofs of these various parts to the reader. □


These properties of C make of C what we shall call a field, which we 
shall study in much greater depth later in the book. What the lemma says is 
that we are allowed to calculate in C more or less as we did with real num- 
bers. However, C has a much richer structure than the set of real numbers. 


We now introduce a “size” function on C. 


Definition. If $z = a + bi \in C$, then the absolute value of z, written as
$|z|$, is defined by $|z| = \sqrt{z\bar{z}} = \sqrt{a^2 + b^2}$.


We shall see, in a few moments, what this last definition means geomet- 
rically. In the meantime we prove 


Lemma 1.7.3. If $u, v \in C$, then $|uv| = |u|\,|v|$.

Proof. By definition, $|u| = \sqrt{u\bar{u}}$ and $|v| = \sqrt{v\bar{v}}$. Now


$|uv| = \sqrt{(uv)\overline{(uv)}} = \sqrt{(uv)(\bar{u}\,\bar{v})}$ (by Part (c) of Lemma 1.7.1)
$= \sqrt{(u\bar{u})(v\bar{v})}$ (by Lemma 1.7.2)
$= \sqrt{u\bar{u}}\,\sqrt{v\bar{v}} = |u|\,|v|$. □

Another way of verifying this lemma is to write u = a + bi, v = c + di,
uv = (ac - bd) + (ad + bc)i and to note the identity
$(ac - bd)^2 + (ad + bc)^2 = (a^2 + b^2)(c^2 + d^2)$.


Note several small points about conjugates. If $z \in C$, then z is real if
and only if $\bar{z} = z$, and z is purely imaginary if and only if $\bar{z} = -z$.
If $z, w \in C$, then


$\overline{(z\bar{w} + \bar{z}w)} = \bar{z}\,\bar{\bar{w}} + \bar{\bar{z}}\,\bar{w} = \bar{z}w + z\bar{w}$,


so $z\bar{w} + \bar{z}w$ is real. We want to get an upper bound for
$|z\bar{w} + \bar{z}w|$; this will come up in the proof of Theorem 1.7.5 below.

But first we must digress for a moment to obtain a statement about 
quadratic expressions. 


Lemma 1.7.4. Let a, b, c be real, with a > 0. If $a\alpha^2 + b\alpha + c \ge 0$ for
every real $\alpha$, then $b^2 - 4ac \le 0$.


Proof. Consider the quadratic expression for $\alpha = -b/2a$. We get
$a(-b/2a)^2 + b(-b/2a) + c \ge 0$. Simplifying this, we obtain that
$(4ac - b^2)/4a \ge 0$, and since a > 0, we end up with $4ac - b^2 \ge 0$, and so
$b^2 - 4ac \le 0$. □


We use this result immediately to prove the important 


Theorem 1.7.5 (Triangle Inequality). For $z, w \in C$, $|z + w| \le |z| + |w|$.


Proof. If z = 0, there is nothing to prove, so we may assume that $z \ne 0$;
thus $z\bar{z} > 0$. Now, for $\alpha$ real,

$0 \le |\alpha z + w|^2 = (\alpha z + w)\overline{(\alpha z + w)} = (\alpha z + w)(\alpha\bar{z} + \bar{w})$
$= \alpha^2 z\bar{z} + \alpha(z\bar{w} + \bar{z}w) + w\bar{w}$.

If $a = z\bar{z} > 0$, $b = z\bar{w} + \bar{z}w$, $c = w\bar{w}$, then Lemma 1.7.4 tells
us that $b^2 - 4ac = (z\bar{w} + \bar{z}w)^2 - 4(z\bar{z})(w\bar{w}) \le 0$, hence
$(z\bar{w} + \bar{z}w)^2 \le 4(z\bar{z})(w\bar{w}) = 4|z|^2|w|^2$. Therefore,
$z\bar{w} + \bar{z}w \le 2|z|\,|w|$.
For $\alpha = 1$ above,


$|z + w|^2 = z\bar{z} + w\bar{w} + z\bar{w} + \bar{z}w = |z|^2 + |w|^2 + z\bar{w} + \bar{z}w \le |z|^2 + |w|^2 + 2|z|\,|w|$


from the result above. In other words, $|z + w|^2 \le (|z| + |w|)^2$; taking square
roots we get the desired result, $|z + w| \le |z| + |w|$. □


Why is this result called the triangle inequality? The reason will be clear 
once we view the complex numbers geometrically. Represent the complex 
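number z = a + bi as the point having coordinates (a, b) in the x-y plane.


[Figure: the point (a, b) in the x-y plane, at distance r from the origin, with
θ the angle between the positive x-axis and the segment from the origin to (a, b).]


The distance r of this point from the origin is $\sqrt{a^2 + b^2}$, in other words, $|z|$.
The angle $\theta$ is called the argument of z and, as we see, $\tan\theta = b/a$. Also,
$a = r\cos\theta$, $b = r\sin\theta$; therefore, $z = a + bi = r(\cos\theta + i\sin\theta)$.
This representation of z is called its polar form.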

Given z = a + bi, w = c + di, then their sum is z + w = (a + c) +
(b + d)i. Geometrically, we have the picture:


[Figure: the points z, w, and z + w = (a + c, b + d) in the x-y plane.]


The statement $|z + w| \le |z| + |w|$ merely reflects the fact that in a triangle
one side is of smaller length than the sum of the lengths of the other two
sides; thus the term triangle inequality.

The complex numbers that come up in the polar form $\cos\theta + i\sin\theta$
are very interesting numbers indeed. Specifically,


$|\cos\theta + i\sin\theta| = \sqrt{\cos^2\theta + \sin^2\theta} = \sqrt{1} = 1$,


so they give us many complex numbers of absolute value 1. In truth they give
us all the complex numbers of absolute value 1; to see this just go back and
look at the polar form of such a number.

Let’s recall two basic identities from trigonometry,
$\cos(\theta + \psi) = \cos\theta\cos\psi - \sin\theta\sin\psi$ and
$\sin(\theta + \psi) = \sin\theta\cos\psi + \cos\theta\sin\psi$. Therefore,
if $z = r(\cos\theta + i\sin\theta)$ and $w = s(\cos\psi + i\sin\psi)$, then


$zw = r(\cos\theta + i\sin\theta) \cdot s(\cos\psi + i\sin\psi)$
$= rs(\cos\theta\cos\psi - \sin\theta\sin\psi) + i\,rs(\sin\theta\cos\psi + \cos\theta\sin\psi)$
$= rs[\cos(\theta + \psi) + i\sin(\theta + \psi)]$.


Thus, in multiplying two complex numbers, the argument of the product is
the sum of the arguments of the factors.
This has another very interesting consequence. 


Theorem 1.7.6 (De Moivre’s Theorem). For any integer $n \ge 1$,
$[r(\cos\theta + i\sin\theta)]^n = r^n[\cos(n\theta) + i\sin(n\theta)]$.


Proof. We proceed by induction on n. If n = 1, the statement is
obviously true. Assume then that for some k,
$[r(\cos\theta + i\sin\theta)]^k = r^k[\cos k\theta + i\sin k\theta]$. Thus


$[r(\cos\theta + i\sin\theta)]^{k+1} = [r(\cos\theta + i\sin\theta)]^k \cdot r(\cos\theta + i\sin\theta)$
$= r^k(\cos k\theta + i\sin k\theta) \cdot r(\cos\theta + i\sin\theta)$
$= r^{k+1}[\cos(k + 1)\theta + i\sin(k + 1)\theta]$


by the result of the paragraph above. This completes the induction step;
hence the result is true for all integers $n \ge 1$. □
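
De Moivre’s Theorem is easy to test numerically. The Python sketch below compares the two sides of the formula in floating-point arithmetic, so agreement is only up to rounding error; the helper names `polar` and `de_moivre_gap` are assumptions of this example.

```python
import math


def polar(r, theta):
    """The complex number r(cos(theta) + i sin(theta))."""
    return complex(r * math.cos(theta), r * math.sin(theta))


def de_moivre_gap(r, theta, n):
    """Absolute difference between the two sides of De Moivre's formula."""
    left = polar(r, theta) ** n          # [r(cos t + i sin t)]^n
    right = polar(r ** n, n * theta)     # r^n [cos(nt) + i sin(nt)]
    return abs(left - right)


print(de_moivre_gap(1.5, 0.7, 5) < 1e-9)   # True
print(de_moivre_gap(2.0, -1.1, 8) < 1e-9)  # True
```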


In the problems we shall see that De Moivre’s Theorem is true for all
integers m; in fact, it is true even if m is rational.
Consider the following special case:


$\theta_n = \cos\frac{2\pi}{n} + i\sin\frac{2\pi}{n}$, where $n \ge 1$ is an integer.

By De Moivre’s Theorem,

$\left(\cos\frac{2\pi}{n} + i\sin\frac{2\pi}{n}\right)^n = \cos\left(n \cdot \frac{2\pi}{n}\right) + i\sin\left(n \cdot \frac{2\pi}{n}\right) = \cos 2\pi + i\sin 2\pi = 1$.


So $\theta_n^n = 1$; you can verify that $\theta_n^m \ne 1$ if 0 < m < n. This property
of $\theta_n$ makes it one of the primitive nth roots of unity, which will be
encountered in Problem 26.
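
For small n the behavior of the powers of $\theta_n$ can be observed numerically. The Python sketch below finds the smallest m ≥ 1 for which $\theta_n^m$ is within a tolerance of 1; the function names and the tolerance `tol` are assumptions of this example, and since it works in floating point it illustrates the claim rather than proving it.

```python
import math


def theta(n):
    """theta_n = cos(2*pi/n) + i sin(2*pi/n)."""
    return complex(math.cos(2 * math.pi / n), math.sin(2 * math.pi / n))


def order_of_theta(n, tol=1e-9):
    """Smallest m >= 1 such that theta_n^m is within tol of 1."""
    t = theta(n)
    power = t
    m = 1
    while abs(power - 1) > tol:
        power *= t
        m += 1
    return m


print([order_of_theta(n) for n in range(1, 9)])  # [1, 2, 3, 4, 5, 6, 7, 8]
```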


PROBLEMS 
Easier Problems 


1. Multiply. 
(a) (6 - 7i)(8 + i).
(b) (2 + 21) — 21). 
(c) (6 + 7i)(8 - i).
2. Express $z^{-1}$ in the form $z^{-1} = a + bi$ for:
(a) z = 6 + 8i.
(b) z = 6 - 8i.
(c) z= es a = l. 
V2 «V2 
*3. Show that $(\bar{z})^{-1} = \overline{(z^{-1})}$.
4. Find $(\cos\theta + i\sin\theta)^{-1}$.
5. Verify parts a, b, e, f of Lemma 1.7.1.
*6. Show that z is real if and only if $\bar{z} = z$, and is purely imaginary if and
only if $\bar{z} = -z$.
7. Verify the commutative law of multiplication zw = wz in C.
8. Show that for $z \ne 0$, $|z^{-1}| = 1/|z|$.


9. Find:
(a) |6 - 4i|.
(b) [5+ gil. 


1 Deo 

—S—_ + —_—— : 

o 3 al 
10. Show that $|\bar{z}| = |z|$.

11. Find the polar form for


ee eae as 
2 V2 
(b) z = 4 
(c) eee 
V2 = «V2 
13 39 
(d) z = ay Dae 


12. Prove that $(\cos(\tfrac{1}{2}\theta) + i\sin(\tfrac{1}{2}\theta))^2 = \cos\theta + i\sin\theta$.


13. By direct multiplication show that $(\tfrac{1}{2} + \tfrac{1}{2}\sqrt{3}\,i)^3 = -1$.


Middle-Level Problems 


14. Show that $(\cos\theta + i\sin\theta)^m = \cos(m\theta) + i\sin(m\theta)$ for all
integers m.

15. Show that $(\cos\theta + i\sin\theta)^r = \cos(r\theta) + i\sin(r\theta)$ for all
rational numbers r.

16. If $z \in C$ and $n \ge 1$ is any positive integer, show that there are n distinct
complex numbers w such that $z = w^n$.

17. Find the necessary and sufficient condition on k such that:


$\left[\cos\left(\frac{2\pi k}{n}\right) + i\sin\left(\frac{2\pi k}{n}\right)\right]^n = 1$ and
$\left[\cos\left(\frac{2\pi k}{n}\right) + i\sin\left(\frac{2\pi k}{n}\right)\right]^m \ne 1$ if 0 < m < n.


18. Viewing the x-y plane as the set of all complex numbers x + iy, show that
multiplication by i induces a 90° rotation of the x-y plane in a
counterclockwise direction.

19. In Problem 18, interpret geometrically what multiplication by the com- 
plex number a + bi does to the x-y plane. 

*20. Prove that $|z + w|^2 + |z - w|^2 = 2(|z|^2 + |w|^2)$.

21. Consider the set $A = \{a + bi \mid a, b \in Z\}$. Prove that there is a 1-1
correspondence of A onto N. (A is called the set of Gaussian integers.)

22. If $\alpha$ is a (complex) root of the polynomial


$x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n$,


where the $a_i$ are real, show that $\bar{\alpha}$ must also be a root. [r is a root
of a polynomial p(x) if p(r) = 0.]


Harder Problems


23. Find the necessary and sufficient conditions on z and w in order that
$|z + w| = |z| + |w|$.
24. Find the necessary and sufficient conditions on $z_1, \ldots, z_k$ in order that
$|z_1 + \cdots + z_k| = |z_1| + \cdots + |z_k|$.
*25. The complex number $\theta$ is said to have order $n \ge 1$ if $\theta^n = 1$ and
$\theta^m \ne 1$ for 0 < m < n. Show that if $\theta$ has order n and $\theta^k = 1$,
where k > 0, then n | k.
*26. Find all complex numbers $\theta$ having order n. (These are the primitive nth
roots of unity.)


GROUPS


1. DEFINITIONS AND EXAMPLES OF GROUPS 


We have seen in Section 4 of Chapter 1 that given any nonempty set S, the set
A(S) of all 1-1 mappings of S onto itself is not just a set alone, but has a far
richer texture. The possibility of combining two elements of A(S) to get
yet another element of A(S) endows A(S) with an algebraic structure. We
recall how this was done: If $f, g \in A(S)$, then we combine them to form the
mapping fg defined by (fg)(s) = f(g(s)) for every $s \in S$. We called fg the
product of f and g, and verified that $fg \in A(S)$, and that this product obeyed
certain rules. From the myriad of possibilities we somehow selected four
particular rules that govern the behavior of A(S) relative to this product.
These four rules were


1. Closure, namely if $f, g \in A(S)$, then $fg \in A(S)$. We say that A(S) is
closed under this product.

2. Associativity, that is, given $f, g, h \in A(S)$, then f(gh) = (fg)h.

3. Existence of a unit element, namely, there exists a particular element
$i \in A(S)$ (the identity mapping) such that fi = if = f for all $f \in A(S)$.

4. Existence of inverses, that is, given $f \in A(S)$ there exists an element,
denoted by $f^{-1}$, in A(S) such that $ff^{-1} = f^{-1}f = i$.


To justify or motivate why these four specific attributes of A(S) were
singled out, in contradistinction to some other set of properties, is not easy to
do. In fact, in the history of the subject it took quite some time to recognize
that these four properties played the key role. We have the advantage of
historical hindsight, and with this hindsight we choose them not only to study
A(S), but also as the chief guidelines for abstracting to a much wider context.

Although we saw that the four properties above enabled us to calculate
concretely in A(S), there were some differences with the kind of calculations
we are used to. If S has three or more elements, we saw in Problem 15,
Chapter 1, Section 4 that it is possible for $f, g \in A(S)$ to have $fg \ne gf$.
However, this did not present us with insurmountable difficulties.

Without any further polemics we go to the 


Definition. A nonempty set G is said to be a group if in G there is
defined an operation * such that:


(a) $a, b \in G$ implies that $a * b \in G$. (We describe this by saying that G is
closed under *.)

(b) Given $a, b, c \in G$, then a * (b * c) = (a * b) * c. (This is described by
saying that the associative law holds in G.)

(c) There exists a special element $e \in G$ such that a * e = e * a = a for all
$a \in G$ (e is called the identity or unit element of G).

(d) For every $a \in G$ there exists an element $b \in G$ such that a * b =
b * a = e. (We write this element b as $a^{-1}$ and call it the inverse of
a in G.)


These four defining postulates (called the group axioms) for a group 
were, after all, patterned after those that hold in A(S). So it is not surprising 
that A(S) is a group relative to the operation “composition of mappings.” 

The operation * in G is usually called the product, but keep in mind 
that this has nothing to do with product as we know it for the integers, ratio- 
nals, reals, or complexes. In fact, as we shall see below, in many familiar ex- 
amples of groups that come from numbers, what we call the product in these 
groups is actually the addition of numbers. However, a general group need 
have no relation whatsoever to a set of numbers. We reiterate: A group is no 
more, no less, than a nonempty set with an operation * satisfying the four 
group axioms. 
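
For a finite set the four group axioms can be checked mechanically by running through all pairs and triples of elements. The Python sketch below does exactly that; the function `is_group`, the helper `gaussian_mul`, and the test set {1, i, -1, -i} (written as integer pairs) are assumptions chosen for this illustration. A brute-force check of this kind is of no use for infinite sets such as the integers.

```python
from itertools import product


def is_group(elements, op):
    """Brute-force check of the four group axioms on a finite set."""
    elements = list(elements)
    # (a) Closure: a * b must again lie in the set.
    if any(op(a, b) not in elements for a, b in product(elements, repeat=2)):
        return False
    # (b) Associativity: a * (b * c) == (a * b) * c for all triples.
    if any(op(a, op(b, c)) != op(op(a, b), c)
           for a, b, c in product(elements, repeat=3)):
        return False
    # (c) Identity: some e with a * e == e * a == a for every a.
    identities = [e for e in elements
                  if all(op(a, e) == a and op(e, a) == a for a in elements)]
    if not identities:
        return False
    e = identities[0]
    # (d) Inverses: every a has some b with a * b == b * a == e.
    return all(any(op(a, b) == e and op(b, a) == e for b in elements)
               for a in elements)


def gaussian_mul(z, w):
    """Product of a + bi and c + di, stored as integer pairs."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)


units = [(1, 0), (0, 1), (-1, 0), (0, -1)]       # 1, i, -1, -i
print(is_group(units, gaussian_mul))             # True
print(is_group([(1, 0), (0, 1)], gaussian_mul))  # False: i * i = -1 is missing
```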

Before starting to look into the nature of groups, we look at some ex- 
amples. 


Examples of Groups 


1. Let Z be the set of all integers and let * be the ordinary addition, +, in Z.
That Z is closed and associative under * are basic properties of the integers.
What serves as the unit element, e, of Z under *? Clearly, since a = a * e =
a + e, we have e = 0, and 0 is the required identity element under addition.
What about $a^{-1}$? Here too, since $e = 0 = a * a^{-1} = a + a^{-1}$, the
$a^{-1}$ in this instance is -a, and clearly a * (-a) = a + (-a) = 0.


2. Let Q be the set of all rational numbers and let the operation * on Q be
the ordinary addition of rational numbers. As above, Q is easily shown to be
a group under *. Note that $Z \subset Q$ and both Z and Q are groups under the
same operation *.


3. Let Q′ be the set of all nonzero rational numbers and let the operation
* on Q′ be the ordinary multiplication of rational numbers. By the familiar
properties of the rational numbers we see that Q′ forms a group relative to *.


4. Let $R^+$ be the set of all positive real numbers and let the operation * on
$R^+$ be the ordinary product of real numbers. Again it is easy to check that
$R^+$ is a group under *.


5. Let $E_n$ be the set of $\theta_n^i$, $i = 0, 1, 2, \ldots, n - 1$, where $\theta_n$ is
the complex number $\theta_n = \cos(2\pi/n) + i\sin(2\pi/n)$. Let
$\theta_n^k * \theta_n^j = \theta_n^{k+j}$, the ordinary product of the powers of
$\theta_n$ as complex numbers. By De Moivre’s Theorem we saw that
$\theta_n^n = 1$. We leave it to the reader to verify that $E_n$ is a group under *.
The elements of $E_n$ are called the nth roots of unity. The picture below
illustrates the group $E_n$, whose elements are represented by the dots on the
unit circle in the complex plane.


Note one striking difference between the Examples 1 to 4 and Example
5; the first four have an infinite number of elements, whereas $E_n$ has a finite
number, n, of elements.


Definition. A group G is said to be a finite group if it has a finite
number of elements. The number of elements in G is called the order of G and
is denoted by |G|.

Thus $E_n$ above is a finite group, and $|E_n| = n$.

All the examples presented above satisfy the additional property that
a * b = b * a for any pair of elements. This need not be true in a group. Just
witness the case of A(S), where S has three or more elements; there we saw
that we could find $f, g \in A(S)$ such that $fg \ne gf$.

This prompts us to single out as special those groups G in which
a * b = b * a for all $a, b \in G$.


Definition. A group G is said to be abelian if a * b = b * a for all
$a, b \in G$.


The word abelian derives from the name of the great Norwegian mathematician 
Niels Henrik Abel (1802-1829), one of the greatest scientists Norway has ever 
produced. 


A group that is not abelian is called nonabelian, a not too surprising 
choice of name. 

We now give examples of some nonabelian groups. Of course, the 
A(S) afford us an infinite family of such. But we present a few other exam- 
ples in which we can compute quite readily. 


6. Let R be the set of all real numbers, and let G be the set of all mappings
$T_{a,b}: R \to R$ defined by $T_{a,b}(r) = ar + b$ for any real number r, where a, b
are real numbers and $a \ne 0$. Thus, for instance, $T_{5,-6}$ is such that
$T_{5,-6}(r) = 5r - 6$; $T_{5,-6}(14) = 5 \cdot 14 - 6 = 64$, $T_{5,-6}(\pi) = 5\pi - 6$.
The $T_{a,b}$ are 1-1 mappings of R onto itself, and we let $T_{a,b} * T_{c,d}$ be the
product of two of these mappings. So


$(T_{a,b} * T_{c,d})(r) = T_{a,b}(T_{c,d}(r)) = aT_{c,d}(r) + b = a(cr + d) + b$
$= (ac)r + (ad + b) = T_{ac,\,ad+b}(r)$.


So we have the formula


$T_{a,b} * T_{c,d} = T_{ac,\,ad+b}$. (1)


This result shows us that $T_{a,b} * T_{c,d}$ is in G, for it satisfies the membership
requirement for belonging to G, so G is closed under *. Since we are talking
about the product of mappings (i.e., the composition of mappings), * is
associative. The element $T_{1,0} = i$ is the identity mapping of R onto itself.
Finally, what is $T_{a,b}^{-1}$? Can we find real numbers $x \ne 0$ and y, such that

$T_{a,b} * T_{x,y} = T_{x,y} * T_{a,b} = T_{1,0}$?


Go back to (1) above; we thus want $T_{ax,\,ay+b} = T_{1,0}$, that is, ax = 1,
ay + b = 0. Remember now that $a \ne 0$, so if we put $x = a^{-1}$ and
$y = -a^{-1}b$, the required relations are satisfied. One verifies immediately that


$T_{a,b} * T_{a^{-1},\,-a^{-1}b} = T_{a^{-1},\,-a^{-1}b} * T_{a,b} = T_{1,0}$.


So G is indeed a group.
What is $T_{c,d} * T_{a,b}$? According to the formula given in (1), where we
replace a by c, c by a, b by d, d by b, we get


$T_{c,d} * T_{a,b} = T_{ca,\,cb+d}$. (2)


Thus $T_{c,d} * T_{a,b} = T_{a,b} * T_{c,d}$ if and only if bc + d = ad + b. This fails
to be true, for instance, if a = 1, b = 1, c = 2, d = 3. So G is nonabelian.
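
Formulas (1) and (2) lend themselves to a quick computation. In the Python sketch below a mapping $T_{a,b}$ is stored as the pair (a, b); this representation and the names `compose`, `apply`, and `inverse` are assumptions of the example. Exact fractions are used so that the inverse check involves no rounding.

```python
from fractions import Fraction


def compose(t, u):
    """Formula (1): T_{a,b} * T_{c,d} = T_{ac, ad+b}, with t = (a, b), u = (c, d)."""
    (a, b), (c, d) = t, u
    return (a * c, a * d + b)


def apply(t, r):
    """Evaluate T_{a,b}(r) = ar + b."""
    a, b = t
    return a * r + b


def inverse(t):
    """T_{a,b}^{-1} = T_{1/a, -b/a}; requires a != 0."""
    a, b = Fraction(t[0]), Fraction(t[1])
    return (1 / a, -b / a)


t, u = (1, 1), (2, 3)          # a = 1, b = 1, c = 2, d = 3, as in the text
print(compose(t, u))           # (2, 4)
print(compose(u, t))           # (2, 5), so the two products differ

s = (5, -6)
print(compose(s, inverse(s)))  # (Fraction(1, 1), Fraction(0, 1)), i.e. T_{1,0}
```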


7. Let $H \subset G$, where G is the group in Example 6, and H is defined by
$H = \{T_{a,b} \in G \mid a \text{ is rational}, b \text{ any real}\}$. We leave it to the
reader to verify that H is a group under the operation * defined on G. H is
nonabelian.


8. Let $K \subset H \subset G$, where H, G are as above and
$K = \{T_{1,b} \in G \mid b \text{ any real}\}$.
The reader should check that K is a group relative to the operation * of G,
and that K is, however, abelian.


9. Let S be the plane, that is, $S = \{(x, y) \mid x, y \text{ real}\}$ and consider
$f, g \in A(S)$ defined by f(x, y) = (-x, y) and g(x, y) = (-y, x); f is the
reflection about the y-axis and g is the rotation through 90° in a
counterclockwise direction about the origin. We then define
$G = \{f^i g^j \mid i = 0, 1;\ j = 0, 1, 2, 3\}$, and let * in G be the product of
elements in A(S). Clearly, $f^2 = g^4$ = identity mapping;


(f * g)(x, y) = (fg)(x, y) = f(g(x, y)) = f(-y, x) = (y, x)
and


(g * f)(x, y) = g(f(x, y)) = g(-x, y) = (-y, -x).


So $g * f \ne f * g$. We leave it to the reader to verify that $g * f = f * g^{-1}$
and G is a nonabelian group of order 8. This group is called the dihedral group
of order 8. [Try to find a formula for $(f^i g^j) * (f^s g^t) = f^a g^b$ that
expresses a, b in terms of i, j, s, and t.]
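
The claims made about f and g in Example 9 can be checked by computing with the two mappings directly. In the Python sketch below each mapping is identified by where it sends the points (1, 0) and (0, 1), which suffices here because f, g, and their products are linear; the helper names are assumptions of this example.

```python
def f(p):
    x, y = p
    return (-x, y)   # reflection about the y-axis


def g(p):
    x, y = p
    return (-y, x)   # counterclockwise rotation through 90 degrees


def power(h, n):
    """Return the mapping h composed with itself n times (n >= 0)."""
    def result(p):
        for _ in range(n):
            p = h(p)
        return p
    return result


def compose(h1, h2):
    """The product h1 * h2, that is, apply h2 first and then h1."""
    return lambda p: h1(h2(p))


def signature(h):
    """Images of (1, 0) and (0, 1); these determine a linear map of the plane."""
    return (h((1, 0)), h((0, 1)))


elements = {signature(compose(power(f, i), power(g, j)))
            for i in (0, 1) for j in (0, 1, 2, 3)}
print(len(elements))  # 8

g_inverse = power(g, 3)  # g^4 is the identity, so g^3 acts as the inverse of g
print(signature(compose(g, f)) == signature(compose(f, g_inverse)))  # True
print(signature(compose(g, f)) == signature(compose(f, g)))          # False
```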


10. Let S be as in Example 9 and f the mapping in Example 9. Let n > 2 and
let h be the rotation of the plane about the origin through an angle of $2\pi/n$
in the counterclockwise direction. We then define
$G = \{f^k h^j \mid k = 0, 1;\ j = 0, 1, 2, \ldots, n - 1\}$ and define the product *
in G via the usual product of mappings. One can verify that
$f^2 = h^n$ = identity mapping, and $fh = h^{-1}f$.
These relations allow us to show (with some effort) that G is a nonabelian
group of order 2n. G is called the dihedral group of order 2n.


11. Let $G = \{f \in A(S) \mid f(s) \ne s \text{ for only a finite number of } s \in S\}$,
where we suppose that S is an infinite set. We claim that G is a group under the
product * in A(S). The associativity holds automatically in G, since it already
holds in A(S). Also, $i \in G$, since i(s) = s for all $s \in S$. So we must show that
G is closed under the product and if $f \in G$, then $f^{-1} \in G$.

We first dispose of the closure. Suppose that $f, g \in G$; then f(s) = s
except, say, for $s_1, s_2, \ldots, s_n$ and g(s) = s except for
$s'_1, s'_2, \ldots, s'_m$. Then (fg)(s) = f(g(s)) = s for all s other than
$s_1, s_2, \ldots, s_n, s'_1, \ldots, s'_m$ (and possibly even for some of these).
So fg moves only a finite number of elements of S, so $fg \in G$.

Finally, if f(s) = s for all s other than $s_1, s_2, \ldots, s_n$, then
$f^{-1}(f(s)) = f^{-1}(s)$, but $f^{-1}(f(s)) = (f^{-1}f)(s) = i(s) = s$. So we obtain
that $f^{-1}(s) = s$ for all s except $s_1, \ldots, s_n$. Thus $f^{-1} \in G$ and G
satisfies all the group axioms, hence G is a group.


12. Let G be the set of all mappings $T_\theta$, where $T_\theta$ is the rotation of a
given circle about its center through an angle $\theta$ in the clockwise direction.
In G define * by the composition of mappings. Since, as is readily verified,
$T_\theta * T_\psi = T_{\theta + \psi}$, G is closed under *. The other group axioms
check out easily. Note that $T_{2\pi} = T_0$ = the identity mapping, and
$T_\theta^{-1} = T_{-\theta} = T_{2\pi - \theta}$. G is an abelian group.


As we did for A(S) we introduce the shorthand notation $a^n$ for


$a * a * \cdots * a$ (n times)


and define $a^{-n} = (a^{-1})^n$, for n a positive integer, and $a^0 = e$. The usual
rules of exponents then hold, that is, $(a^m)^n = a^{mn}$ and
$a^m * a^n = a^{m+n}$ for any integers m and n.

Note that with this notation, if G is the group of integers under +, then
$a^n$ is really na.

Having seen the 12 examples of groups above, the reader might get the 
impression that all, or almost all, sets with some operation * form groups. 
This is far from true. We now give some examples of nongroups. In each case 
we check the four group axioms and see which of these fail to hold. 


Nonexamples 


1. Let G be the set of all integers, and let * be the ordinary product of
integers in G. Since a * b = ab, for $a, b \in G$ we clearly have that G is closed
and associative relative to *. Furthermore, the number 1 serves as the unit
element, since a * 1 = a1 = a = 1a = 1 * a for every $a \in G$. So we are
three-fourths of the way to proving that G is a group. All we need is inverses for
the elements of G, relative to *, to lie in G. But this just isn’t so. Clearly, we
cannot find an integer b such that 0 * b = 0b = 1, since 0b = 0 for all b. But
even other integers fail to have inverses in G. For instance, we cannot find an
integer b such that 3 * b = 1 (for this would require that $b = \tfrac{1}{3}$, and
$\tfrac{1}{3}$ is not an integer).


2. Let G be the set of all nonzero real numbers and define, for a, b ∈ G, a * b
= a^2 b; thus 4 * 5 = 4^2(5) = 80. Which of the group axioms hold in G under
this operation * and which fail to hold? Certainly, G is closed under *. Is *
associative? If so, (a * b) * c = a * (b * c), that is, (a * b)^2 c = a^2 (b * c), and
so (a^2 b)^2 c = a^2 (b^2 c), which boils down to a^2 = 1, which holds only for
a = ±1. So, in general, the associative law does not hold in G relative to *.
We similarly can verify that G does not have a unit element. Thus even to
discuss inverses relative to * would not make sense.
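
The failure of associativity in this nonexample can also be checked numerically. The following is a small sketch, not part of the original argument; the function name star and the list samples are our own choices, and exact rational arithmetic is used so that no rounding enters.

```python
from fractions import Fraction

def star(a, b):
    """The operation of Nonexample 2: a * b = a^2 b."""
    return a * a * b

# A handful of nonzero rationals (an arbitrary choice of test values).
samples = [Fraction(n, d) for n in (-3, -1, 1, 2, 4) for d in (1, 2)]

# For each triple, compare "the associative law holds here" with "a^2 = 1".
consistent = all(
    (star(star(a, b), c) == star(a, star(b, c))) == (a * a == 1)
    for a in samples for b in samples for c in samples
)

# Triples on which the two ways of bracketing disagree.
failures = [(a, b, c) for a in samples for b in samples for c in samples
            if star(star(a, b), c) != star(a, star(b, c))]
```

On these samples the two bracketings agree exactly when a = ±1, in line with the computation above.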


3. Let G be the set of all positive integers, under * where a * b = ab, the 
ordinary product of integers. Then one can easily verify that G fails to be a 
group only because it fails to have inverses for some (in fact, most) of its ele- 
ments relative to *. 


We shall find some other nonexamples of groups in the exercises. 


PROBLEMS 
Easier Problems 


1. Determine if the following sets G with the operation indicated form a
group. If not, point out which of the group axioms fail.

(a) G = set of all integers, a * b = a - b.

(b) G = set of all integers, a * b = a + b + ab.

(c) G = set of nonnegative integers, a * b = a + b.

(d) G = set of all rational numbers ≠ -1, a * b = a + b + ab.

(e) G = set of all rational numbers with denominator divisible by 5
(written so that numerator and denominator are relatively prime), a * b =
a + b.

(f) G a set having more than one element, a * b = a for all a, b ∈ G.


2. In the group G defined in Example 6, show that the set H = {T_{a,b} | a = ±1,
b any real} forms a group under the * of G.


3. Verify that Example 7 is indeed an example of a group. 
4. Prove that K defined in Example 8 is an abelian group.

5. In Example 9, prove that g * f = f * g^{-1}, and that G is a group, is
nonabelian, and is of order 8.

6. Let G and H be as in Examples 6 and 7, respectively. Show that if
T_{a,b} ∈ G, then T_{a,b} * V * T_{a,b}^{-1} ∈ H if V ∈ H.

7. Do Problem 6 with H replaced by the group K of Example 8.

8. If G is an abelian group, prove that (a * b)^n = a^n * b^n for all integers n.

9. If G is a group in which a^2 = e for all a ∈ G, show that G is abelian.

10. If G is the group in Example 6, find all T_{a,b} ∈ G such that T_{a,b} * T_{1,x} =
T_{1,x} * T_{a,b} for all real x.

11. In Example 10, for n = 3 find a formula that expresses (f^i h^j) * (f^s h^t) as
f^a h^b. Show that G is a nonabelian group of order 6.

12. Do Problem 11 for n = 4.

13. Show that any group of order 4 or less is abelian.

14. If G is any group and a, b, c ∈ G, show that if a * b = a * c, then b = c,
and if b * a = c * a, then b = c.

15. Express (a * b)^{-1} in terms of a^{-1} and b^{-1}.

16. Using the result of Problem 15, prove that a group G in which a = a^{-1}
for every a ∈ G must be abelian.

17. In any group G, prove that (a^{-1})^{-1} = a for all a ∈ G.

*18. If G is a finite group of even order, show that there must be an element
a ≠ e such that a = a^{-1}. (Hint: Try to use the result of Problem 17.)

19. In S_3, show that there are four elements x satisfying x^2 = e and three
elements y satisfying y^3 = e.

20. Find all the elements in S_4 such that x^4 = e.


Middle-Level Problems 


21. Show that a group of order 5 must be abelian.

22. Show that the set defined in Example 10 is a group, is nonabelian, and
has order 2n. Do this by finding the formula for (f^i h^j) * (f^s h^t) in the
form f^a h^b.

23. In the group G of Example 6, find all elements U ∈ G such that
U * T_{a,b} = T_{a,b} * U for every T_{a,b} ∈ G.

24. If G is the dihedral group of order 2n as defined in Example 10, prove that:

(a) If n is odd and a ∈ G is such that a * b = b * a for all b ∈ G, then a = e.

(b) If n is even, show that there is an a ∈ G, a ≠ e, such that a * b = b * a
for all b ∈ G.

(c) If n is even, find all the elements a ∈ G such that a * b = b * a
for all b ∈ G.

25. If G is any group, show that:

(a) e is unique (i.e., if f ∈ G also acts as a unit element for G, then f = e).

(b) Given a ∈ G, then a^{-1} ∈ G is unique.

*26. If G is a finite group, prove that, given a ∈ G, there is a positive integer
n, depending on a, such that a^n = e.

*27. In Problem 26, show that there is an integer m > 0 such that a^m = e
for all a ∈ G.


Harder Problems 


28. Let G be a set with an operation * such that:

1. G is closed under *.

2. * is associative.

3. There exists an element e ∈ G such that e * x = x for all x ∈ G.

4. Given x ∈ G, there exists a y ∈ G such that y * x = e.

Prove that G is a group. (Thus you must show that x * e = x and x * y = e
for e, y as above.)

29. Let G be a finite nonempty set with an operation * such that:

1. G is closed under *.

2. * is associative.

3. Given a, b, c ∈ G with a * b = a * c, then b = c.

4. Given a, b, c ∈ G with b * a = c * a, then b = c.

Prove that G must be a group under *.

30. Give an example to show that the result of Problem 29 can be false if G
is an infinite set.

31. Let G be the group of all nonzero real numbers under the operation *
which is the ordinary multiplication of real numbers, and let H be the
group of all real numbers under the operation #, which is the addition of
real numbers.

(a) Show that there is a mapping F: G → H of G onto H which satisfies
F(a * b) = F(a) # F(b) for all a, b ∈ G [i.e., F(ab) = F(a) + F(b)].

(b) Show that no such mapping F can be 1-1.


2. SOME SIMPLE REMARKS 


In this short section we show that certain formal properties which follow from 
the group axioms hold in any group. As a matter of fact, most of these results 
have already occurred as problems at the end of the preceding section. 

It is a little clumsy to keep writing the * for the product in G, and from
now on we shall write the product a * b simply as ab for all a, b ∈ G.
The first such formal results we prove are contained in 


Lemma 2.2.1. If G is a group, then: 


(a) Its identity element is unique.
(b) Every a ∈ G has a unique inverse a^{-1} ∈ G.
(c) If a ∈ G, (a^{-1})^{-1} = a.
(d) For a, b ∈ G, (ab)^{-1} = b^{-1}a^{-1}.


Proof. We start with Part (a). What is expected of us to carry out the
proof? We must show that if e, f ∈ G and af = fa = a for all a ∈ G and
ae = ea = a for all a ∈ G, then e = f. This is very easy, for then e = ef and
f = ef; hence e = ef = f, as required.

Instead of proving Part (b), we shall prove a stronger result (listed 
below as Lemma 2.2.2), which will have Part (b) as an immediate conse- 
quence. We claim that in a group G if ab = ac, then b = c; that is, we can 
cancel a given element from the same side of an equation. To see this, we 
have, for a ∈ G, an element u ∈ G such that ua = e. Thus from ab = ac we
have 


u(ab) = u(ac), 


so, by the associative law, (ua)b = (ua)c, that is, eb = ec. Hence b = eb = 
ec = c, and our result is established. A similar argument shows that if 
ba = ca, then b = c. However, we cannot conclude from ab = ca that b = c; 
in any abelian group, yes, but in general, no. 

Now to get Part (b) as an implication of the cancellation result. Suppose
that b, c ∈ G act as inverses for a; then ab = e = ac, so by cancellation b = c
and we see that the inverse of a is unique. We shall always write it as a^{-1}.

To see Part (c), note that by definition a^{-1}(a^{-1})^{-1} = e; but a^{-1}a = e, so
by cancellation in a^{-1}(a^{-1})^{-1} = e = a^{-1}a we get that (a^{-1})^{-1} = a.

Finally, for Part (d) we calculate

    (ab)(b^{-1}a^{-1}) = ((ab)b^{-1})a^{-1}     (associative law)
                       = (a(bb^{-1}))a^{-1}     (again the associative law)
                       = (ae)a^{-1} = aa^{-1} = e.

Similarly, (b^{-1}a^{-1})(ab) = e. Hence, by definition, (ab)^{-1} = b^{-1}a^{-1}. □
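
Part (d) can be tried out on a small nonabelian group. The sketch below is only an illustration under our own conventions: the elements of S_3 are stored as tuples f, where f[i] is the image of i, and the helper names compose and inverse are ours. The product fg means "first g, then f", as in A(S).

```python
from itertools import permutations

def compose(f, g):
    """The product fg: apply g first, then f."""
    return tuple(f[g[i]] for i in range(len(g)))

def inverse(f):
    """The inverse mapping of the permutation f."""
    inv = [0] * len(f)
    for i, image in enumerate(f):
        inv[image] = i
    return tuple(inv)

S3 = list(permutations(range(3)))  # the six elements of S_3

# (ab)^{-1} = b^{-1} a^{-1} for every pair a, b.
reversed_rule = all(
    inverse(compose(a, b)) == compose(inverse(b), inverse(a))
    for a in S3 for b in S3
)

# The unreversed formula (ab)^{-1} = a^{-1} b^{-1}, tested on every pair.
naive_rule = all(
    inverse(compose(a, b)) == compose(inverse(a), inverse(b))
    for a in S3 for b in S3
)
```

Here reversed_rule holds for all pairs, while naive_rule does not, since S_3 is not abelian; the order of the factors in Part (d) matters.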

We promised to list a piece of the argument given above as a separate
lemma. We keep this promise and write 


Lemma 2.2.2. In any group G and a, b, c ∈ G, we have:


(a) If ab = ac, then b = c.
(b) If ba = ca, then b = c.


Before leaving these results, note that if G is the group of real
numbers under +, then Part (c) of Lemma 2.2.1 translates into the familiar
-(-a) = a.

There is only a scant bit of mathematics in this section; accordingly, 
we give only a few problems. No indication is given as to the difficulty of these. 


PROBLEMS


1. Suppose that G is a set closed under an associative operation such that

1. given a, y ∈ G, there is an x ∈ G such that ax = y, and

2. given a, w ∈ G, there is a u ∈ G such that ua = w.

Show that G is a group.

*2. If G is a finite set closed under an associative operation such that ax = ay
forces x = y and ua = wa forces u = w, for every a, x, y, u, w ∈ G, prove
that G is a group. (This is a repeat of a problem given earlier. It will be
used in the body of the text later.)


3. If G is a group in which (ab)^i = a^i b^i for three consecutive integers i, prove
that G is abelian.

4. Show that the result of Problem 3 would not always be true if the word
"three" were replaced by "two." In other words, show that there is a
group G and consecutive numbers i, i + 1 such that G is not abelian but
does have the property that (ab)^i = a^i b^i and (ab)^{i+1} = a^{i+1} b^{i+1} for all
a, b in G.

5. Let G be a group in which (ab)^3 = a^3 b^3 and (ab)^5 = a^5 b^5 for all a, b ∈ G.
Show that G is abelian.

6. Let G be a group in which (ab)^n = a^n b^n for some fixed integer n > 1 for
all a, b ∈ G. For all a, b ∈ G, prove that:

(a) (ab)^{n-1} = b^{n-1} a^{n-1}.

(b) a^n b^{n-1} = b^{n-1} a^n.

(c) (aba^{-1}b^{-1})^{n(n-1)} = e.

[Hint for Part (c): Note that (aba^{-1})^r = ab^r a^{-1} for all integers r.]


3. SUBGROUPS


In order for us to find out more about the makeup of a given group G, it may 
be too much of a task to tackle all of G head-on. It might be desirable to 
focus our attention on appropriate pieces of G, which are smaller, over which 
we have some control, and are such that the information gathered about 
them can be used to get relevant information and insight about G itself. The 
question then becomes: What should serve as suitable pieces for this kind of 
dissection of G? Clearly, whatever we choose as such pieces, we want them 
to reflect the fact that G is a group, not merely any old set. 

A group is distinguished from an ordinary set by the fact that it is en- 
dowed with a well-behaved operation. It is thus natural to demand that such 
pieces above behave reasonably with respect to the operation of G. Once 
this is granted, we are led almost immediately to the concept of a subgroup 
of a group. 


Definition. A nonempty subset, H, of a group G is called a subgroup
of G if, relative to the product in G, H itself forms a group.


We stress the phrase “relative to the product in G.” Take, for instance, 
the subset A = {1, —1} in Z, the set of integers. Under the multiplication of 
integers, A is a group. But A is not a subgroup of Z viewed as a group with 
respect to +. 

Every group G automatically has two obvious subgroups, namely G it- 
self and the subgroup consisting of the identity element, e, alone. These two 
subgroups we call trivial subgroups. Our interest will be in the remaining 
ones, the proper subgroups of G. 

Before proceeding to a closer look at the general character of
subgroups, we want to look at some specific subgroups of some particular,
explicit groups. Some of the groups we consider are those we introduced as
examples in Section 1; we maintain the numbering given there for them. In
some of these examples we shall verify that certain specified subsets are indeed
subgroups. We would strongly recommend that the reader carry out such a
verification in lots of the others and try to find other examples for themselves.

In trying to verify whether or not a given subset of a group is a sub- 
group, we are spared checking one of the axioms defining a group, namely 
the associative law. Since the associative law holds universally in a group G, 
given any subset A of G and any three elements of A, then the associative 
law certainly holds for them. So we must check, for a given subset A of G,
whether A is closed under the operation of G, whether e is in A, and finally,
given a ∈ A, whether a^{-1} is also in A.

Note that we can save one more calculation. Suppose that A ⊂ G is
nonempty and that given a, b ∈ A, then ab ∈ A. Suppose further that given
a ∈ A, then a^{-1} ∈ A. Then we assert that e ∈ A. For pick a ∈ A; then a^{-1} ∈ A
by supposition, hence aa^{-1} ∈ A, again by supposition. Since aa^{-1} = e, we
have that e ∈ A. Thus A is a subgroup of G. In other words,


Lemma 2.3.1. A nonempty subset A ⊂ G is a subgroup of G if and
only if A is closed with respect to the operation of G and, given a ∈ A, then
a^{-1} ∈ A.


We now consider some examples. 


Examples 


1. Let G be the group Z of integers under + and let H be the set of even
integers. We claim that H is a subgroup of Z. Why? Is H closed, that is, given
a, b ∈ H, is a + b ∈ H? In other words, if a, b are even integers, is a + b an
even integer? The answer is yes, so H is certainly closed under +. Now to the
inverse. Since the operation in Z is +, the inverse of a ∈ Z relative to this
operation is -a. If a ∈ H, that is, if a is even, then -a is also even, hence
-a ∈ H. In short, H is a subgroup of Z under +.


2. Let G once again be the group Z of integers under +. In Example 1, H,
the set of even integers, can be described in another way: namely H consists
of all multiples of 2. There is nothing particular in Example 1 that makes use
of 2 itself. Let m > 1 be any integer and let H_m consist of all multiples of m
in Z. We leave it to the reader to verify that H_m is a subgroup of Z under +.


3. Let S be any nonempty set and let G = A(S). If a ∈ S, let H(a) =
{f ∈ A(S) | f(a) = a}. We claim that H(a) is a subgroup of G. For if
f, g ∈ H(a), then (fg)(a) = f(g(a)) = f(a) = a, since f(a) = g(a) = a. Thus
fg ∈ H(a). Also, if f ∈ H(a), then f(a) = a, so that f^{-1}(f(a)) = f^{-1}(a). But
f^{-1}(f(a)) = (f^{-1}f)(a) = i(a) = a. Thus, since a = f^{-1}(f(a)) = f^{-1}(a), we have
that f^{-1} ∈ H(a). Moreover, H(a) is nonempty. (Why?) Consequently, H(a) is a
subgroup of G.


4. Let G be as in Example 6 of Section 1, and H as in Example 7. Then H is a
subgroup of G (see Problem 3 in Section 1).


5. Let G be as in Example 6, H as in Example 7, and K as in Example 8 in
Section 1. Then K ⊂ H ⊂ G and K is a subgroup of both H and of G.


6. Let C' be the nonzero complex numbers as a group under the
multiplication of complex numbers. Let V = {a ∈ C' | |a| is rational}. Then V is a
subgroup of C'. For if |a| and |b| are rational, then |ab| = |a| |b| is rational, so
ab ∈ V; also, |a^{-1}| = 1/|a| is rational, hence a^{-1} ∈ V. Therefore, V is a
subgroup of C'.


7. Let C' and V be as above and let

    U = {a ∈ C' | a = cos θ + i sin θ, θ any real}.

If a = cos θ + i sin θ and b = cos ψ + i sin ψ, we saw in Chapter 1 that ab =
cos(θ + ψ) + i sin(θ + ψ), so that ab ∈ U, and that a^{-1} = cos θ - i sin θ =
cos(-θ) + i sin(-θ) ∈ U. Also, |a| = 1, since |a| = sqrt(cos^2 θ + sin^2 θ) = 1.
Therefore, U ⊂ V ⊂ C' and U is a subgroup both of V and of C'.


8. Let C', U, V be as above, and let n > 1 be an integer. Let θ_n =
cos(2π/n) + i sin(2π/n), and let B = {1, θ_n, θ_n^2, ..., θ_n^{n-1}}. Since θ_n^n = 1 (as
we saw by De Moivre's Theorem), it is easily checked that B is a subgroup of
U, V, and C', and is of order n.


9. Let G be any group and let a ∈ G. The set A = {a^i | i any integer} is a
subgroup of G. For, by the rules of exponents, if a^i ∈ A and a^j ∈ A, then a^i a^j =
a^{i+j}, so is in A. Also, (a^i)^{-1} = a^{-i}, so (a^i)^{-1} ∈ A. This makes A into a
subgroup of G.

A is the cyclic subgroup of G generated by a in the following sense. 


Definition. The cyclic subgroup of G generated by a is the set {a^i | i any
integer}. It is denoted (a).


Note that if e is the identity element of G, then (e) = {e}. In Example 8,
the group B is the cyclic group (θ_n) of C' generated by θ_n.


10. Let G be any group; for a ∈ G let C(a) = {g ∈ G | ga = ag}. We claim
that C(a) is a subgroup of G. First, the closure of C(a). If g, h ∈ C(a), then
ga = ag and ha = ah, thus (gh)a = g(ha) = g(ah) = (ga)h = (ag)h =
a(gh) (by the repeated use of the associative law), hence gh ∈ C(a). Also, if
g ∈ C(a), then from ga = ag we have g^{-1}(ga)g^{-1} = g^{-1}(ag)g^{-1}, which
simplifies to ag^{-1} = g^{-1}a; whence g^{-1} ∈ C(a). So, C(a) is thereby a
subgroup of G.

These particular subgroups C(a) will come up later for us and they are 
given a special name. We call C(a) the centralizer of a in G. If in a group 
ab = ba, we say that a and b commute. Thus C(a) is the set of all elements in 
G that commute with a. 
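
For a finite group the centralizer can be found by simply testing every element. The sketch below does this in S_3; it is an illustration under our own conventions (permutations stored as tuples, with helper names compose, inverse, and centralizer chosen by us), and it checks the two conditions of Lemma 2.3.1 for every C(a).

```python
from itertools import permutations

def compose(f, g):
    """The product fg: apply g first, then f."""
    return tuple(f[g[i]] for i in range(len(g)))

def inverse(f):
    """The inverse mapping of the permutation f."""
    inv = [0] * len(f)
    for i, image in enumerate(f):
        inv[image] = i
    return tuple(inv)

def centralizer(a, G):
    """C(a): all g in G with ga = ag."""
    return {g for g in G if compose(g, a) == compose(a, g)}

S3 = list(permutations(range(3)))
e = (0, 1, 2)  # the identity permutation

def is_subgroup(A):
    """Closure and inverses, the two conditions of Lemma 2.3.1."""
    closed = all(compose(x, y) in A for x in A for y in A)
    inverses = all(inverse(x) in A for x in A)
    return bool(A) and closed and inverses

every_centralizer_is_subgroup = all(is_subgroup(centralizer(a, S3)) for a in S3)
```

The same function works for any finite group whose elements are given as permutation tuples.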


11. Let G be any group and let Z(G) = {z ∈ G | zx = xz for all x ∈ G}. We
leave it to the reader to verify that Z(G) is a subgroup of G. It is called the
center of G.


12. Let G be any group and H a subgroup of G. For a ∈ G, let a^{-1}Ha =
{a^{-1}ha | h ∈ H}. We assert that a^{-1}Ha is a subgroup of G. If x = a^{-1}h_1 a and
y = a^{-1}h_2 a where h_1, h_2 ∈ H, then xy = (a^{-1}h_1 a)(a^{-1}h_2 a) = a^{-1}(h_1 h_2)a
(associative law), and since H is a subgroup of G, h_1 h_2 ∈ H. Therefore,
a^{-1}(h_1 h_2)a ∈ a^{-1}Ha, which says that xy ∈ a^{-1}Ha. Thus a^{-1}Ha is closed.
Also, if x = a^{-1}ha ∈ a^{-1}Ha, then, as is easily verified, x^{-1} = (a^{-1}ha)^{-1} =
a^{-1}h^{-1}a ∈ a^{-1}Ha. Therefore, a^{-1}Ha is a subgroup of G.


An even dozen seems to be about the right number of examples, so we 
go on to other things. Lemma 2.3.1 points out for us what we need in order 
that a given subset of a group be a subgroup. In an important special case we 
can make a considerable saving in checking whether a given subset H is a 
subgroup of G. This is the case in which H is finite. 


Lemma 2.3.2. Suppose that G is a group and H a nonempty finite sub- 
set of G closed under the product in G. Then H is a subgroup of G. 


Proof. By Lemma 2.3.1 we must show that a ∈ H implies a^{-1} ∈ H. If
a = e, then a^{-1} = e and we are done. Suppose then that a ≠ e; consider
the elements a, a^2, ..., a^{n+1}, where n = |H|, the order of H. Here we
have written down n + 1 elements, all of them in H since H is closed, and H
has only n distinct elements. How can this be? Only if some two of the
elements listed are equal; put another way, only if a^i = a^j for some 1 ≤ i <
j ≤ n + 1. But then, by the cancellation property in groups, a^{j-i} = e. Since
j - i ≥ 1, a^{j-i} ∈ H, hence e ∈ H. However, j - i - 1 ≥ 0, so a^{j-i-1} ∈ H
and aa^{j-i-1} = a^{j-i} = e, whence a^{-1} = a^{j-i-1} ∈ H. This proves the
lemma. □
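
Lemma 2.3.2 lends itself to a brute-force check in a small group. The sketch below, which is ours and not part of the proof, runs through every nonempty subset of S_3 that is closed under the product and confirms that each one already contains the inverses of its elements. Permutations are stored as tuples, and the helper names are our own.

```python
from itertools import combinations, permutations

def compose(f, g):
    """The product fg: apply g first, then f."""
    return tuple(f[g[i]] for i in range(len(g)))

def inverse(f):
    """The inverse mapping of the permutation f."""
    inv = [0] * len(f)
    for i, image in enumerate(f):
        inv[image] = i
    return tuple(inv)

S3 = list(permutations(range(3)))

def is_closed(H):
    """True if the product of any two elements of H lies in H."""
    return all(compose(a, b) in H for a in H for b in H)

def has_inverses(H):
    """True if H contains the inverse of each of its elements."""
    return all(inverse(a) in H for a in H)

# Every nonempty subset of S_3 that is closed under the product.
closed_subsets = [set(c)
                  for r in range(1, len(S3) + 1)
                  for c in combinations(S3, r)
                  if is_closed(set(c))]

lemma_holds = all(has_inverses(H) for H in closed_subsets)
```

The finiteness of H is essential: in Z under +, the positive integers are closed under + but do not form a subgroup.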


An immediate, but nevertheless important, corollary to Lemma 2.3.2 is the 


Corollary. If G is a finite group and H a nonempty subset of G closed
under multiplication, then H is a subgroup of G.


PROBLEMS 
Easier Problems 


1. If A, B are subgroups of G, show that A ∩ B is a subgroup of G.

2. What is the cyclic subgroup of Z generated by -1 under +?

3. Let S_3 be the symmetric group of degree 3. Find all the subgroups of S_3.

4. Verify that Z(G), the center of G, is a subgroup of G. (See Example 11.)

5. If C(a) is the centralizer of a in G (Example 10), prove that Z(G) =
∩_{a∈G} C(a).

6. Show that a ∈ Z(G) if and only if C(a) = G.

7. In S_3, find C(a) for each a ∈ S_3.

8. If G is an abelian group and if H = {a ∈ G | a^2 = e}, show that H is a
subgroup of G.

9. Give an example of a nonabelian group for which the H in Problem 8 is
not a subgroup.

10. If G is an abelian group and n > 1 an integer, let A_n = {a^n | a ∈ G}.
Prove that A_n is a subgroup of G.

*11. If G is an abelian group and H = {a ∈ G | a^{n(a)} = e for some n(a) > 1
depending on a}, prove that H is a subgroup of G.

We say that a group G is cyclic if there exists an a ∈ G such that
every x ∈ G is a power of a, that is, x = a^j for some j. In other words, G
is cyclic if G = (a) for some a ∈ G, in which case we say that a is a
generator for G.

*12. Prove that a cyclic group is abelian.

13. If G is cyclic, show that every subgroup of G is cyclic.

14. If G has no proper subgroups, prove that G is cyclic.

15. If G is a group and H a nonempty subset of G such that, given a, b ∈ H,
then ab^{-1} ∈ H, prove that H is a subgroup of G.


Middle-Level Problems 


*16. If G has no proper subgroups, prove that G is cyclic of order p, where p
is a prime number. (This sharpens the result of Problem 14.)

17. If G is a group and a, x ∈ G, prove that C(x^{-1}ax) = x^{-1}C(a)x. [See
Examples 10 and 12 for the definitions of C(b) and of x^{-1}C(a)x.]

18. If S is a nonempty set and X ⊂ S, show that T(X) = {f ∈ A(S) | f(X) ⊂
X} is a subgroup of A(S) if X is finite.

19. If A, B are subgroups of an abelian group G, let AB = {ab | a ∈ A, b ∈ B}.
Prove that AB is a subgroup of G.

20. Give an example of a group G and two subgroups A, B of G such that
AB is not a subgroup of G.

21. If A, B are subgroups of G such that b^{-1}Ab ⊂ A for all b ∈ B, show that
AB is a subgroup of G.

*22. If A and B are finite subgroups, of orders m and n, respectively, of the
abelian group G, prove that AB is a subgroup of order mn if m and n are
relatively prime.

23. What is the order of AB in Problem 22 if m and n are not relatively prime?

24. If H is a subgroup of G, let N = ∩_{x∈G} x^{-1}Hx. Prove that N is a subgroup
of G such that y^{-1}Ny = N for every y ∈ G.


Harder Problems 


25. Let S, X, T(X) be as in Problem 18 (but X no longer finite). Give an
example of a set S and an infinite subset X such that T(X) is not a
subgroup of A(S).

*26. Let G be a group, H a subgroup of G. Let Hx = {hx | h ∈ H}. Show that,
given a, b ∈ G, then Ha = Hb or Ha ∩ Hb = ∅.

*27. If in Problem 26 H is a finite subgroup of G, prove that Ha and Hb have
the same number of elements. What is this number?

28. Let M, N be subgroups of G such that x^{-1}Mx ⊂ M and x^{-1}Nx ⊂ N for
all x ∈ G. Prove that MN is a subgroup of G and that x^{-1}(MN)x ⊂ MN
for all x ∈ G.

*29. If M is a subgroup of G such that x^{-1}Mx ⊂ M for all x ∈ G, prove that
actually x^{-1}Mx = M.

30. If M, N are such that x^{-1}Mx = M and x^{-1}Nx = N for all x ∈ G and if
M ∩ N = (e), prove that mn = nm for any m ∈ M, n ∈ N. (Hint: Consider
the element m^{-1}n^{-1}mn.)


4. LAGRANGE’S THEOREM 


We are about to derive the first real group-theoretic result of importance. 
Although its proof is relatively easy, this theorem is like the A-B-C’s for fi- 
nite groups and has interesting implications in number theory. 

As a matter of fact, those of you who solved Problems 26 and 27 of Sec- 
tion 3 have all the necessary ingredients to effect a proof of the result. The 
theorem simply states that in a finite group the order of a subgroup divides 
the order of the group. 

To smooth the argument of this theorem—which is due to Lagrange— 
and for use many times later, we make a short detour into the realm of set 
theory. 

Just as the concept of "function" runs throughout most phases of
mathematics, so also does the concept of "relation." A relation is a statement aRb
about the elements a, b ∈ S. If S is the set of integers, a = b is a relation
on S. Similarly, a < b is a relation on S, as is a ≤ b.


Definition. A relation ~ on a set S is called an equivalence relation if,
for all a, b, c ∈ S, it satisfies:


(a) a ~ a (reflexivity).
(b) a ~ b implies that b ~ a (symmetry).
(c) a ~ b, b ~ c implies that a ~ c (transitivity).


Of course, equality, =, is an equivalence relation, so the general notion 
of equivalence relation is a generalization of that of equality. In a sense, an 
equivalence relation measures equality with regard to some attribute. This 
vague remark may become clearer after we see some examples. 


Examples 


1. Let S be all the items for sale in a grocery store; we declare a ~ b, for
a, b ∈ S, if the price of a equals that of b. Clearly, the defining rules of an
equivalence relation hold for this ~. Note that in measuring this “generalized 
equality” on S we ignore all properties of the elements of S other than their 
price. So a ~ b if they are equal as far as the attribute of price is concerned. 


2. Let S be the integers and n > 1 a fixed integer. We define a ~ b for a,
b ∈ S if n | (a - b). We verify that this is an equivalence relation. Since n | 0
and 0 = a - a, we have a ~ a. Because n | (a - b) implies that n | (b - a), we
have that a ~ b implies that b ~ a. Finally, if a ~ b and b ~ c, then
n | (a - b) and n | (b - c); hence n | ((a - b) + (b - c)), that is, n | (a - c).
Therefore, a ~ c.

This relation on the integers is of great importance in number theory
and is called congruence modulo n; when a ~ b, we write this as a ≡ b mod n
[or, sometimes, as a ≡ b(n)], which is read "a congruent to b mod n." We'll
be running into it very often from now on. As we shall see, this is a special
case of a much wider phenomenon in groups.
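
The three defining properties can be spot-checked on a finite window of integers. This is only a sketch: the function name congruent, the modulus n = 6, and the range S are arbitrary choices of ours, and a finite check is of course no substitute for the argument just given.

```python
def congruent(a, b, n):
    """True if n divides a - b, that is, a ≡ b mod n."""
    return (a - b) % n == 0

n = 6
S = range(-12, 13)  # a finite window of integers to test on

reflexive = all(congruent(a, a, n) for a in S)

symmetric = all(congruent(b, a, n)
                for a in S for b in S
                if congruent(a, b, n))

transitive = all(congruent(a, c, n)
                 for a in S for b in S for c in S
                 if congruent(a, b, n) and congruent(b, c, n))
```

All three flags come out true for any modulus n > 1 and any window, as the proof in Example 2 predicts.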


3. We generalize Example 2. Let G be a group and H a subgroup of G. For
a, b ∈ G, define a ~ b if ab^{-1} ∈ H. Since e ∈ H and e = aa^{-1}, we have that
a ~ a. Also, if ab^{-1} ∈ H, then since H is a subgroup of G, (ab^{-1})^{-1} ∈ H. But
(ab^{-1})^{-1} = (b^{-1})^{-1}a^{-1} = ba^{-1}, so ba^{-1} ∈ H, hence b ~ a. This tells us that
a ~ b implies that b ~ a. Finally, if a ~ b and b ~ c, then ab^{-1} ∈ H and
bc^{-1} ∈ H. But (ab^{-1})(bc^{-1}) = ac^{-1}, whence ac^{-1} ∈ H and therefore
a ~ c. We have shown the transitivity of ~, thus ~ is an equivalence relation
on G.

Note that if G = Z, the group of integers under +, and H is the
subgroup consisting of all multiples of n, for n > 1 a fixed integer, then ab^{-1} ∈ H
translates into a ≡ b(n). So congruence mod n is a very special case of the
equivalence we have defined in Example 3.

It is this equivalence relation that we shall use in proving Lagrange’s 
theorem. 


4. Let G be any group. For a, b ∈ G we declare that a ~ b if there exists an
x ∈ G such that b = x^{-1}ax. We claim that this defines an equivalence
relation on G. First, a ~ a for a = e^{-1}ae. Second, if a ~ b, then b = x^{-1}ax, hence
a = (x^{-1})^{-1}b(x^{-1}), so that b ~ a. Finally, if a ~ b, b ~ c, then b = x^{-1}ax, c =
y^{-1}by for some x, y ∈ G. Thus c = y^{-1}(x^{-1}ax)y = (xy)^{-1}a(xy), and so a ~ c.
We have established that this defines an equivalence relation on G.

This relation, too, plays an important role in group theory and is given
the special name conjugacy. When a ~ b we say that "a and b are conjugate
in G." Note that if G is abelian, then a ~ b if and only if a = b.


We could go on and on to give numerous interesting examples of equiva- 
lence relations, but this would sidetrack us from our main goal in this section. 
There will be no lack of examples in the problems at the end of this section. 

We go on with our discussion and make the 


Definition. If ~ is an equivalence relation on S, then [a], the class of
a, is defined by [a] = {b ∈ S | b ~ a}.


Let us see what the class of a is in the two examples, Examples 3 and 4, 
just given. 

In Example 3, a ~ b if ab^{-1} ∈ H, that is, if ab^{-1} = h, for some h ∈ H.
Thus a ~ b implies that a = hb. On the other hand, if a = kb where k ∈ H,
then ab^{-1} = (kb)b^{-1} = k ∈ H, so a ~ b if and only if a ∈ Hb = {hb | h ∈ H}.
Therefore, [b] = Hb.

The set Hb is called a right coset of H in G. We ran into such in
Problem 26 of Section 3. Note that b ∈ Hb, since b = eb and e ∈ H (also because
b ∈ [b] = Hb). Right cosets, and left-handed counterparts of them called left
cosets, play important roles in what follows.

In Example 4, we defined a ~ b if b = x^{-1}ax for some x ∈ G. Thus [a] =
{x^{-1}ax | x ∈ G}. We shall denote [a] in this case as cl(a) and call it the
conjugacy class of a in G. If G is abelian, then cl(a) consists of a alone. In fact, if
a ∈ Z(G), the center of G, then cl(a) consists merely of a.
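
As an illustration, the conjugacy classes of a small group can be listed by direct computation. The sketch below does so for S_3; the tuple representation of permutations and the names compose, inverse, and conjugacy_class are our own assumptions, not notation from the text.

```python
from itertools import permutations

def compose(f, g):
    """The product fg: apply g first, then f."""
    return tuple(f[g[i]] for i in range(len(g)))

def inverse(f):
    """The inverse mapping of the permutation f."""
    inv = [0] * len(f)
    for i, image in enumerate(f):
        inv[image] = i
    return tuple(inv)

def conjugacy_class(a, G):
    """cl(a) = {x^{-1} a x | x in G}."""
    return frozenset(compose(compose(inverse(x), a), x) for x in G)

S3 = list(permutations(range(3)))
e = (0, 1, 2)  # the identity permutation

# Collecting cl(a) over all a; equal classes collapse in the set.
classes = {conjugacy_class(a, S3) for a in S3}
class_sizes = sorted(len(c) for c in classes)
```

In S_3 the distinct classes have 1, 2, and 3 elements, and every element of S_3 lies in exactly one of them.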

The notion of conjugacy and its properties will crop up again often, es- 
pecially in Section 11. 

We shall examine the class of an element a in Example 2 later in this 
chapter. 

The important influence that an equivalence relation has on a set is to 
break it up and partition it into nice disjoint pieces. 


Theorem 2.4.1. If ~ is an equivalence relation on S, then S = ∪[a],
where this union runs over one element from each class, and where [a] ≠ [b]
implies that [a] ∩ [b] = ∅. That is, ~ partitions S into equivalence classes.


Proof. Since a ∈ [a], we have ∪_{a∈S} [a] = S. The proof of the second
assertion is also quite easy. We show that if [a] ≠ [b], then [a] ∩ [b] = ∅, or,
what is equivalent to this, if [a] ∩ [b] ≠ ∅, then [a] = [b].

Suppose, then, that [a] ∩ [b] ≠ ∅; let c ∈ [a] ∩ [b]. By definition of
class, c ~ a since c ∈ [a] and c ~ b since c ∈ [b]. Therefore, a ~ c by
symmetry of ~, and so, since a ~ c and c ~ b, we have a ~ b. Thus a ∈ [b]; if
x ∈ [a], then x ~ a, a ~ b gives us that x ~ b, hence x ∈ [b]. Thus [a] ⊂ [b].
The argument is obviously symmetric in a and b, so we have [b] ⊂ [a],
whence [a] = [b], and our assertion above is proved.

The theorem is now completely proved. □


We now can prove a famous result of Lagrange. 


Theorem 2.4.2 (Lagrange’s Theorem). If G is a finite group and H is 
a subgroup of G, then the order of H divides the order of G. 


Proof. Let us look back at Example 3, where we established that the
relation a ~ b if ab^{-1} ∈ H is an equivalence relation and that

    [a] = Ha = {ha | h ∈ H}.

Let k be the number of distinct classes; call them Ha_1, ..., Ha_k. By
Theorem 2.4.1, G = Ha_1 ∪ Ha_2 ∪ ... ∪ Ha_k and we know that Ha_i ∩ Ha_j = ∅ if
i ≠ j.

We assert that any Ha_i has |H| = order of H number of elements. Map
H → Ha_i by sending h → ha_i. We claim that this map is 1-1, for if ha_i =
h'a_i, then by cancellation in G we would have h = h'; thus the map is 1-1. It
is definitely onto by the very definition of Ha_i. So H and Ha_i have the same
number, |H|, of elements.

Since G = Ha_1 ∪ ... ∪ Ha_k and the Ha_i are disjoint and each Ha_i has
|H| elements, we have that |G| = k|H|. Thus |H| divides |G| and
Lagrange's Theorem is proved. □
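
The counting in this proof can be watched at work in a small case. In the sketch below, which is ours and only illustrative, G is S_3 (permutations stored as tuples) and H is the two-element subgroup generated by one transposition; the names compose and right_coset are our own.

```python
from itertools import permutations

def compose(f, g):
    """The product fg: apply g first, then f."""
    return tuple(f[g[i]] for i in range(len(g)))

G = list(permutations(range(3)))  # S_3, a group of order 6
e = (0, 1, 2)                     # the identity
t = (1, 0, 2)                     # interchanges 0 and 1, so t * t = e
H = [e, t]                        # a subgroup of order 2

def right_coset(H, x):
    """The right coset Hx = {hx | h in H}."""
    return frozenset(compose(h, x) for h in H)

# The distinct right cosets; equal cosets collapse in the set.
cosets = {right_coset(H, x) for x in G}

k = len(cosets)                                   # number of distinct cosets
same_size = all(len(c) == len(H) for c in cosets)
covers_G = set().union(*cosets) == set(G)
```

Here k|H| = |G|, the cosets all have |H| elements, and together they exhaust G, exactly as in the proof.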


Although Lagrange sounds like a French name, J. L. Lagrange (1736-1813) was 
actually Italian, having been born and brought up in Turin. He spent most of 
his life, however, in France. Lagrange was a great mathematician who made 
fundamental contributions to all the areas of mathematics of his day. 


If G is finite, the number of right cosets of H in G, namely |G|/|H|, is
called the index of H in G and is written as i_G(H).

Recall that a group G is said to be cyclic if there is an element a ∈ G
such that every element in G is a power of a.


Theorem 2.4.3. A group G of prime order is cyclic. 


Proof. If H is a subgroup of G then, by invoking Lagrange's Theorem,
|H| divides |G| = p, p a prime, so |H| = 1 or p. So if H ≠ (e), then H = G. If
a ∈ G, a ≠ e, then the powers of a form a subgroup (a) of G different from
(e). So this subgroup is all of G. This says that any x ∈ G is of the form x = a^i.
Hence, G is cyclic by the definition of cyclic group. □


If G is finite and a ∈ G, we saw earlier in the proof of Lemma 2.3.2 that
a^{n(a)} = e for some n(a) ≥ 1, depending on a. We make the


Definition. If G is finite, then the order of a, written o(a), is the least
positive integer m such that a^m = e.


Suppose that a ∈ G has order m. Consider the set A = {e, a, a^2, ..., a^{m-1}};
we claim that A is a subgroup of G (since a^m = e) and that the m elements
listed in A are distinct. We leave the verification of these claims to the
reader. Thus |A| = m = o(a). Since |A| | |G|, we have


Theorem 2.4.4. If G is finite and a ∈ G, then o(a) | |G|.


If a ∈ G, where G is finite, we have, by Theorem 2.4.4, |G| = k · o(a).
Thus

a^{|G|} = a^{k·o(a)} = (a^{o(a)})^k = e^k = e.

We have proved the


Theorem 2.4.5. If G is a finite group of order n, then a^n = e for all
a ∈ G.
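
As a sanity check of Theorems 2.4.4 and 2.4.5, the following Python sketch lists the orders of the six elements of S_3 and confirms that each divides 6 and that a^6 = e for every a. It is an illustration only; the tuple representation of permutations and the helper names are assumptions of ours, not notation from the text.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation x -> p(q(x)), with permutations stored as tuples."""
    return tuple(p[i] for i in q)

def power(p, k):
    """Return p^k for k >= 0 by repeated composition."""
    x = tuple(range(len(p)))
    for _ in range(k):
        x = compose(p, x)
    return x

def order(p):
    """Least positive m with p^m = e."""
    e = tuple(range(len(p)))
    m, x = 1, p
    while x != e:
        x = compose(p, x)
        m += 1
    return m

S3 = list(permutations(range(3)))
e = (0, 1, 2)
orders = sorted(order(p) for p in S3)

print(orders)                                   # [1, 2, 2, 2, 3, 3]
print(all(len(S3) % m == 0 for m in orders))    # True
print(all(power(p, len(S3)) == e for p in S3))  # True
```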


When we apply this last result to certain special groups arising in num- 
ber theory, we shall obtain some classical number-theoretic results due to 
Fermat and Euler. 

Let Z be the integers and let n > 1 be a fixed integer. We go back to
Example 2 of equivalence relations, where we defined a ≡ b mod n (a
congruent to b mod n) if n | (a - b). The class of a, [a], consists of all
a + nk, where k runs through all the integers. We call it the congruence class
of a.

By Euclid’s Algorithm, given any integer b, b = qn + r, where
0 ≤ r < n, thus [b] = [r]. So the n classes [0], [1], ..., [n - 1] give us all
the congruence classes. We leave it to the reader to verify that they are
distinct.

Let Z_n = {[0], [1], ..., [n - 1]}. We shall introduce two operations, +
and ·, in Z_n. Under +, Z_n will form an abelian group; under ·, Z_n will not form
a group, but a certain piece of it will become a group.

How to define [a] + [b]? What is more natural than to define 


[a] + [b] = [a + b]. 


But there is a fly in the ointment. Is this operation + in Z_n well-defined?
What does that mean? We can represent [a] by many a’s (for instance, if
n = 3, [1] = [4] = [-2] = ···), yet we are using a particular a to define the
addition. What we must show is that if [a] = [a'] and [b] = [b'], then [a + b]
= [a' + b'], for then we will have [a] + [b] = [a + b] = [a' + b'] =
[a'] + [b'].

Suppose that [a] = [a']; then n | (a - a'). Also from [b] = [b'],
n | (b - b'), hence n | ((a - a') + (b - b')) = ((a + b) - (a' + b')).
Therefore, a + b ≡ (a' + b') mod n, and so [a + b] = [a' + b'].
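
The independence of the sum from the chosen representatives can also be tested by brute force. The sketch below is a finite check, not a substitute for the proof above; it assumes n = 3, tries every pair of representatives between -6 and 6, and uses a helper name (cls) of our own.

```python
n = 3

def cls(a):
    """Canonical representative r of the class [a], with 0 <= r < n."""
    return a % n

reps = range(-6, 7)

# For all a, a2 with [a] = [a2] and all b, b2 with [b] = [b2],
# compare the classes [a + b] and [a2 + b2].
well_defined = all(
    cls(a + b) == cls(a2 + b2)
    for a in reps for a2 in reps if cls(a) == cls(a2)
    for b in reps for b2 in reps if cls(b) == cls(b2)
)

print(cls(1), cls(4), cls(-2))   # 1 1 1
print(well_defined)              # True
```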

So we now have a well-defined addition in Z_n. The element [0] acts as
the identity element and [-a] acts as -[a], the inverse of [a]. We leave it to
the reader to check out that Z_n is a group under +. It is a cyclic group of
order n generated by [1].

We summarize this all as 


Theorem 2.4.6. Z_n forms a cyclic group under the addition [a] + [b] =
[a + b].


Having disposed of the addition in Z_n, we turn to the introduction of a
multiplication. Again, what is more natural than defining


[a] · [b] = [ab]?


So, for instance, if n = 9, [2][7] = [14] = [5], and [3][6] = [18] = [0]. Under
this multiplication (we leave the fact that it is well-defined to the reader)
Z_n does not form a group. Since [0][a] = [0] for all a, and the unit element
under multiplication is [1], [0] cannot have a multiplicative inverse. Okay,
why not try the nonzero elements [a] ≠ [0] as a candidate for a group under
this product? Here again it is no go if n is not a prime. For instance, if n = 6,
then [2] ≠ [0], [3] ≠ [0], yet [2][3] = [6] = [0], so the nonzero elements do
not, in general, give us a group.
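
To see the obstruction concretely, the short Python sketch below lists the pairs of nonzero classes in Z_n whose product is [0]. It is an illustration under our own naming (zero_divisor_pairs is not a term from the text), with each unordered pair reported once.

```python
def zero_divisor_pairs(n):
    """Pairs (a, b) with 1 <= a <= b < n and [a][b] = [0] in Z_n."""
    return [(a, b) for a in range(1, n) for b in range(a, n) if (a * b) % n == 0]

print(zero_divisor_pairs(6))   # [(2, 3), (3, 4)]
print(zero_divisor_pairs(7))   # []
```

For n = 6 the nonzero classes are not closed under the product, while for the prime 7 no such pair turns up.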

So we ask: Can we find an appropriate piece of Z_n that will form a
group under multiplication? Yes! Let U_n = {[a] ∈ Z_n | (a, n) = 1}, noting
that (a, n) = 1 if and only if (b, n) = 1 for [a] = [b]. By the Corollary to
Theorem 1.5.5, if (a, n) = 1 and (b, n) = 1, then (ab, n) = 1. So [a][b] = [ab]
yields that if [a], [b] ∈ U_n, then [ab] ∈ U_n and U_n is closed. Associativity is
easily checked, following from the associativity of the integers under
multiplication. The identity element is easy to find, namely [1]. Multiplication is
commutative in U_n.

Note that if [a][b] = [a][c] where [a] ∈ U_n, then we have [ab] = [ac],
and so [ab - ac] = [0]. This says that n | a(b - c) = ab - ac; but a is
relatively prime to n. By Theorem 1.5.5 one must have that n | (b - c), and so
[b] = [c]. In other words, we have the cancellation property in U_n. By
Problem 2 of Section 2, U_n is a group.

What is the order of U_n? By the definition of U_n, |U_n| = number of
integers 1 ≤ m < n such that (m, n) = 1. This number comes up often and we
give it a name.


Definition. The Euler φ-function, φ(n), is defined by φ(1) = 1 and,
for n > 1, φ(n) = the number of positive integers m with 1 ≤ m < n such
that (m, n) = 1.


Thus |U_n| = φ(n). If n = p, a prime, we have φ(p) = p - 1. We see
that φ(8) = 4 for only 1, 3, 5, 7 are less than 8 and positive and relatively
prime to 8. We try another one, φ(15). The numbers 1 ≤ m < 15 relatively
prime to 15 are 1, 2, 4, 7, 8, 11, 13, 14, so φ(15) = 8.
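
The definition translates directly into a few lines of Python. This is a minimal sketch that counts by brute force exactly as the definition does; the function name phi is our choice, and faster methods exist that are not discussed here.

```python
from math import gcd

def phi(n):
    """Euler phi-function: phi(1) = 1, else the count of 1 <= m < n with (m, n) = 1."""
    if n == 1:
        return 1
    return sum(1 for m in range(1, n) if gcd(m, n) == 1)

print(phi(8), phi(15), phi(7))   # 4 8 6
```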

Let us look at some examples of U_n.


1. U_8 = {[1], [3], [5], [7]}. Note that [3][5] = [15] = [7], [5]^2 = [25] = [1]. In
fact, U_8 is a group of order 4 in which a^2 = e for every a ∈ U_8.


2. U_15 = {[1], [2], [4], [7], [8], [11], [13], [14]}. Note that [11][13] = [143] =
[8], [2]^4 = [1], and so on.


The reader should verify that a^4 = e = [1] for every a ∈ U_15.


3. U_9 = {[1], [2], [4], [5], [7], [8]}. Note that [2]^1 = [2], [2]^2 = [4], [2]^3 = [8],
[2]^4 = [16] = [7], [2]^5 = [32] = [5]; also [2]^6 = [2][2]^5 = [2][5] = [10] = [1].
So the powers of [2] give us every element in U_9. Thus U_9 is a cyclic group of
order 6. What other elements in U_9 generate U_9?
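
The three examples above can be reproduced mechanically. The Python sketch below is an illustration with helper names of our own (U, powers); it represents each class by its remainder and assumes that the argument of powers is relatively prime to n, since otherwise the loop would never reach 1.

```python
from math import gcd

def U(n):
    """Representatives 1 <= a < n of the classes in U_n (assumes n > 1)."""
    return [a for a in range(1, n) if gcd(a, n) == 1]

def powers(a, n):
    """Successive powers a, a^2, a^3, ... mod n, stopping at the first 1.
    Assumes gcd(a, n) == 1."""
    out, x = [], a % n
    while True:
        out.append(x)
        if x == 1:
            return out
        x = (x * a) % n

print(powers(2, 9))                              # [2, 4, 8, 7, 5, 1]
print(all(pow(a, 2, 8) == 1 for a in U(8)))      # True
print(all(pow(a, 4, 15) == 1 for a in U(15)))    # True
```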


In parallel to Theorem 2.4.6 we have 


Theorem 2.4.7. U_n forms an abelian group, under the product
[a][b] = [ab], of order φ(n), where φ(n) is the Euler φ-function.

An immediate consequence of Theorems 2.4.7 and 2.4.5 is a famous
result in number theory.


Theorem 2.4.8 (Euler). If a is an integer relatively prime to n, then
a^{φ(n)} ≡ 1 mod n.


Proof. U_n forms a group of order φ(n), so by Theorem 2.4.5, a^{φ(n)} = e
for all a ∈ U_n. This translates into [a^{φ(n)}] = [a]^{φ(n)} = [1], which in turn
translates into n | (a^{φ(n)} - 1) for every integer a relatively prime to n. In other
words, a^{φ(n)} ≡ 1 mod n. □
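
Euler's Theorem is easy to test numerically. The sketch below is a finite check over small moduli, not a proof; it assumes the brute-force phi from the definition and relies on Python's built-in three-argument pow for modular powers. The name euler_holds is ours.

```python
from math import gcd

def phi(n):
    """Euler phi-function, counted directly from the definition."""
    return 1 if n == 1 else sum(1 for m in range(1, n) if gcd(m, n) == 1)

def euler_holds(n):
    """True if a^phi(n) = 1 mod n for every 1 <= a < n with (a, n) = 1."""
    return all(pow(a, phi(n), n) == 1 for a in range(1, n) if gcd(a, n) == 1)

print(all(euler_holds(n) for n in range(2, 200)))   # True
```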


A special case, where n = p is a prime, is due to Fermat. 


Corollary (Fermat). If p is a prime and p ∤ a, then
a^{p-1} ≡ 1 mod p.
For any integer b, b^p ≡ b mod p.


Proof. Since φ(p) = p - 1, if (a, p) = 1, we have, by Theorem 2.4.8,
that a^{p-1} ≡ 1 mod p, hence a · a^{p-1} ≡ a mod p, so that a^p ≡ a mod p. If p | b, then
b ≡ 0 mod p and b^p ≡ 0 mod p, so that b^p ≡ b mod p. □
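
The second statement of the corollary, b^p ≡ b mod p for every integer b, can be spot-checked in the same way. The following sketch is illustrative only; fermat_holds is a name of our own, and the last line shows with one composite modulus that the congruence can fail when the modulus is not prime.

```python
def fermat_holds(p):
    """True if b^p = b mod p for a range of integers b, negative ones included."""
    return all(pow(b, p, p) == b % p for b in range(-p, 2 * p))

primes = [2, 3, 5, 7, 11, 13]
print(all(fermat_holds(p) for p in primes))   # True
print(pow(2, 6, 6))                           # 4, which is not 2
```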


Leonhard Euler (1707-1783) was probably the greatest scientist that Switzerland
has produced. He was the most prolific of all mathematicians ever.

Pierre Fermat (1601-1665) was a great number theorist. Fermat’s Last Theorem,
which was in fact first proved in 1994 by Andrew Wiles, states that the
equation a^n + b^n = c^n (a, b, c, n being integers) has only the trivial solution where
a = 0 or b = 0 or c = 0 if n > 2.


One final cautionary word about Lagrange’s Theorem. Its converse in 
general is not true. That is, if G is a finite group of order n, then it need not 
be true that for every divisor m of n there is a subgroup of G of order m. A 
group with this property is very special indeed, and its structure can be 
spelled out quite well and precisely. 


PROBLEMS 
Easier Problems 


1. Verify that the relation ~ is an equivalence relation on the set S given.
(a) S = R reals, a ~ b if a - b is rational.
(b) S = C, the complex numbers, a ~ b if |a| = |b|.
(c) S = straight lines in the plane, a ~ b if a, b are parallel.
(d) S = set of all people, a ~ b if they have the same color eyes.

2. The relation ~ on the real numbers R defined by a ~ b if both a > b and
b > a is not an equivalence relation. Why not? What properties of an
equivalence relation does it satisfy?


3. Let ~ be a relation on a set S that satisfies (1) a ~ b implies that b ~ a
and (2) a ~ b and b ~ c implies that a ~ c. These seem to imply that
a ~ a. For if a ~ b, then by (1), b ~ a, so a ~ b, b ~ a, so by (2), a ~ a. If
this argument is correct, then the relation ~ must be an equivalence
relation. Problem 2 shows that this is not so. What is wrong with the
argument we have given?


4. Let S be a set, {S_α} nonempty subsets such that S = ∪_α S_α and S_α ∩ S_β =
∅ if α ≠ β. Define an equivalence relation on S in such a way that the S_α
are precisely all the equivalence classes.

5. Let G be a group and H a subgroup of G. Define, for a, b ∈ G, a ~ b if
a^{-1}b ∈ H. Prove that this defines an equivalence relation on G, and show
that [a] = aH = {ah | h ∈ H}. The sets aH are called left cosets of H
in G.


6. If G is S_3 and H = {i, f}, where f: S → S is defined by f(x_1) =
x_2, f(x_2) = x_1, f(x_3) = x_3, list all the right cosets of H in G and list all
the left cosets of H in G.

7. In Problem 6, is every right coset of H in G also a left coset of H in G?

8. If every right coset of H in G is a left coset of H in G, prove that
aHa^{-1} = H for all a ∈ G.


9. In Z_16, write down all the cosets of the subgroup H = {[0], [4], [8], [12]}.
(Since the operation in Z_n is +, write your coset as [a] + H. We don’t
need to distinguish between right cosets and left cosets, since Z_n is
abelian under +.)

10. In Problem 9, what is the index of H in Z_16? (Recall that we defined the
index i_G(H) as the number of right cosets in G.)


11. For any finite group G, show that there are as many distinct left cosets of
H in G as there are right cosets of H in G.

12. If aH and bH are distinct left cosets of H in G, are Ha and Hb distinct
right cosets of H in G? Prove that this is true or give a counterexample.


13. Find the orders of all the elements of U_18. Is U_18 cyclic?

14. Find the orders of all the elements of U_20. Is U_20 cyclic?

15. If p is a prime, show that the only solutions of x^2 ≡ 1 mod p are x ≡
1 mod p or x ≡ -1 mod p.

16. If G is a finite abelian group and a_1, ..., a_n are all its elements, show
that x = a_1 a_2 ··· a_n must satisfy x^2 = e.

17. If G is of odd order, what can you say about the x in Problem 16?

18. Using the results of Problems 15 and 16, prove that if p is an odd prime
number, then (p - 1)! ≡ -1 mod p. (This is known as Wilson’s Theorem.)
It is, of course, also true if p = 2.

19. Find all the distinct conjugacy classes of S_3.

20. In the group G of Example 6 of Section 1, find the conjugacy class of the
element T_{a,b}. Describe it in terms of a and b.

21. Let G be the dihedral group of order 8 (see Example 9, Section 1). Find
the conjugacy classes in G.

22. Verify Euler’s Theorem for n = 14 and a = 3, and for n = 14 and a = 5.

23. In U_41, show that there is an element a such that [a]^2 = [-1], that is, an
integer a such that a^2 ≡ -1 mod 41.

24. If p is a prime number of the form 4n + 3, show that we cannot solve

x^2 ≡ -1 mod p

[Hint: Use Fermat’s Theorem that a^{p-1} ≡ 1 mod p if p ∤ a.]

25. Show that the nonzero elements in Z_n form a group under the product
[a][b] = [ab] if and only if n is a prime.


Middle-Level Problems 


26. Let G be a group, H a subgroup of G, and let S be the set of all distinct
right cosets of H in G, T the set of all left cosets of H in G. Prove that there
is a 1-1 mapping of S onto T. (Note: The obvious map that comes to mind,
which sends Ha into aH, is not the right one. See Problems 5 and 12.)

27. If aH = bH forces Ha = Hb in G, show that aHa^{-1} = H for every a ∈ G.

28. If G is a cyclic group of order n, show that there are φ(n) generators for
G. Give their form explicitly.

29. If in a group G, aba^{-1} = b^i, show that a^r b a^{-r} = b^{i^r} for all positive
integers r.

30. If in G a^5 = e and aba^{-1} = b^2, find o(b) if b ≠ e.

*31. If o(a) = m and a^s = e, prove that m | s.

32. Let G be a finite group, H a subgroup of G. Let f(a) be the least positive
m such that a^m ∈ H. Prove that f(a) | o(a).

33. If i ≠ f ∈ A(S) is such that f^p = i, p a prime, and if for some
s ∈ S, f^j(s) = s for some 1 ≤ j < p, show that f(s) = s.

34. If f ∈ A(S) has order p, p a prime, show that for every s ∈ S the orbit of
s under f has one or p elements. [Recall: The orbit of s under f is
{f^j(s) | j any integer}.]

35. If f ∈ A(S) has order p, p a prime, and S is a finite set having n elements,
where (n, p) = 1, show that for some s ∈ S, f(s) = s.


Harder Problems 


36. If a > 1 is an integer, show that n | φ(a^n - 1), where φ is the Euler
φ-function. [Hint: Consider the integers mod (a^n - 1).]

37. In a cyclic group of order n, show that for each positive integer m that
divides n (including m = 1 and m = n) there are φ(m) elements of order m.

38. Using the result of Problem 37, show that n = Σ_{m|n} φ(m).

39. Let G be a finite abelian group of order n for which the number of
solutions of x^m = e is at most m for any m dividing n. Prove that G must be
cyclic. [Hint: Let ψ(m) be the number of elements in G of order m. Show
that ψ(m) ≤ φ(m) and use Problem 38.]

40. Using the result of Problem 39, show that U_p, if p is a prime, is cyclic.
(This is a famous result in number theory; it asserts the existence of a
primitive root mod p.)

41. Using the result of Problem 40, show that if p is a prime of the form
p = 4n + 1, then we can solve x^2 ≡ -1 mod p (with x an integer).

42. Using Wilson’s Theorem (see Problem 18), show that if p is a prime of
the form p = 4n + 1 and if

y = 1 · 2 · 3 ··· (p - 1)/2 = ((p - 1)/2)!,

then y^2 ≡ -1 mod p. (This gives another proof of the result in Problem
41.)

43. Let G be an abelian group of order n, and a_1, ..., a_n its elements. Let
x = a_1 a_2 ··· a_n. Show that:
(a) If G has exactly one element b ≠ e such that b^2 = e, then x = b.
(b) If G has more than one element b ≠ e such that b^2 = e, then x = e.
(c) If n is odd, then x = e (see Problem 16).


5. HOMOMORPHISMS AND NORMAL SUBGROUPS


In a certain sense the subject of group theory is built up out of three basic 
concepts: that of a homomorphism, that of a normal subgroup, and that of 
the factor or quotient group of a group by a normal subgroup. We discuss the 
first two of these in this section, and the third in Section 6. 


Without further ado we introduce the first of these. 




Definition. Let G, G' be two groups; then the mapping φ: G → G' is
a homomorphism if φ(ab) = φ(a)φ(b) for all a, b ∈ G.
(Note: This φ has nothing to do with the Euler φ-function.)


In this definition the product on the left side, in φ(ab), is that of G,
while the product φ(a)φ(b) is that of G'. A short description of a homomorphism
is that it preserves the operation of G. We do not insist that φ be onto; if
it is, we'll say that it is. Before working out some facts about homomorphisms,
we present some examples.


Examples 


1. Let G be the group of all positive reals under the multiplication of reals,
and G' the group of all reals under addition. Let φ: G → G' be defined by
φ(x) = log_10 x for x ∈ G. Since log_10(xy) = log_10 x + log_10 y, we have
φ(xy) = φ(x) + φ(y), so φ is a homomorphism. It also happens to be onto
and 1-1.


2. Let G be an abelian group and let φ: G → G be defined by φ(a) = a^2.
Since φ(ab) = (ab)^2 = a^2b^2 = φ(a)φ(b), φ is a homomorphism of G into
itself. It need not be onto; the reader should check that in U_8 (see Section 4)
a^2 = e for all a ∈ U_8, so φ(G) = (e).


3. The example of U_8 above suggests the so-called trivial homomorphism. Let
G be any group and G' any other; define φ(x) = e', the unit element of G',
for all x ∈ G. Trivially, φ is a homomorphism of G into G'. It certainly is not
a very interesting one.

Another homomorphism always present is the identity mapping, i, of any
group G into itself. Since i(x) = x for all x ∈ G, clearly i(xy) = xy = i(x)i(y).
The map i is 1-1 and onto, but, again, is not too interesting as a
homomorphism.


4. Let G be the group of integers under + and G' = {1, -1}, the subgroup of
the reals under multiplication. Define φ(m) = 1 if m is even, φ(m) = -1 if m
is odd. The statement that φ is a homomorphism is merely a restatement of:


even + even = even, even + odd = odd, and odd + odd = even. 


5. Let G be the group of all nonzero complex numbers under multiplication
and let G' be the group of positive reals under multiplication. Let φ: G → G'
be defined by φ(a) = |a|; then φ(ab) = |ab| = |a| |b| = φ(a)φ(b), so φ is a
homomorphism of G into G'. In fact, φ is onto.


6. Let G be the group in Example 6 of Section 1, and G' the group of
nonzero reals under multiplication. Define φ: G → G' by φ(T_{a,b}) = a. That
φ is a homomorphism follows from the product rule in G, namely, T_{a,b}T_{c,d} =
T_{ac,ad+b}.


7. Let G = Z be the group of integers under + and let G' = Z_n. Define
φ: G → Z_n by φ(m) = [m]. Since the addition in Z_n is defined by [m] + [r] =
[m + r], we see that φ(m + r) = φ(m) + φ(r), so φ is indeed a
homomorphism of Z onto Z_n.


8. The following general construction gives rise to a well-known theorem.
Let G be any group, and let A(G) be the set of all 1-1 mappings of G onto
itself; here we are viewing G merely as a set, forgetting about its multiplication.
Define T_a: G → G by T_a(x) = ax for every x ∈ G. What is the product, T_aT_b, of
T_a and T_b as mappings on G? Well,


(T_aT_b)(x) = T_a(T_b x) = T_a(bx) = a(bx) = (ab)x = T_{ab}(x)


(we used the associative law). So we see that T_aT_b = T_{ab}.

Define the mapping φ: G → A(G) by φ(a) = T_a, for a ∈ G. The product
rule for the T’s translates into φ(ab) = T_{ab} = T_aT_b = φ(a)φ(b), so φ is a
homomorphism of G into A(G). We claim that φ is 1-1. Suppose that φ(a) =
φ(b), that is, T_a = T_b. Therefore, a = T_a(e) = T_b(e) = b, so φ is indeed 1-1.
It is not onto in general; for instance, if G has order n > 2, then A(G) has
order n!, and since n! > n, φ doesn’t have a ghost of a chance of being onto.
It is easy to verify that the image of φ, φ(G) = {T_a | a ∈ G}, is a subgroup
of A(G).
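
The construction can be tried out on a small group. The Python sketch below is an illustration under assumptions of our own: it takes G = U_8 with classes represented by 1, 3, 5, 7, stores each mapping T_a as the tuple of images of the elements of G, and checks the product rule T_aT_b = T_{ab} and the fact that a → T_a is 1-1. The helper names T, compose, and index are ours.

```python
from math import gcd

n = 8
G = [a for a in range(1, n) if gcd(a, n) == 1]   # U_8, represented by 1, 3, 5, 7
index = {x: k for k, x in enumerate(G)}          # position of each element in G

def T(a):
    """The map x -> ax on G, stored as the tuple of images of G's elements."""
    return tuple((a * x) % n for x in G)

def compose(s, t):
    """The map x -> s(t(x)) for maps stored as image tuples."""
    return tuple(s[index[y]] for y in t)

# phi(ab) = T_ab should equal T_a T_b, and a -> T_a should be 1-1.
is_hom = all(T((a * b) % n) == compose(T(a), T(b)) for a in G for b in G)
is_one_to_one = len({T(a) for a in G}) == len(G)

print(T(3))                    # (3, 1, 7, 5)
print(is_hom, is_one_to_one)   # True True
```

Here A(G) has 4! = 24 elements while the image of φ has only 4, which matches the remark that φ is not onto in general.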

The fact that φ is 1-1 suggests that perhaps 1-1 homomorphisms should
play a special role. We single them out in the following definition.


Definition. The homomorphism φ: G → G' is called a monomorphism
if φ is 1-1. A monomorphism that is onto is called an isomorphism. An
isomorphism from G to G itself is called an automorphism.


One more definition. 


Definition. Two groups G and G' are said to be isomorphic if there is
an isomorphism of G onto G'.
We shall denote that G and G' are isomorphic by writing G ≃ G'.


This definition seems to be asymmetric, but, in point of fact, it is not. 
For if there is an isomorphism of G onto G’, there is one of G' onto G (see 
Problem 2). 

We shall discuss more thoroughly later what it means for two groups to
be isomorphic. But now we summarize what we did in Example 8.

Theorem 2.5.1 (Cayley’s Theorem). Every group G is isomorphic to
some subgroup of A(S), for an appropriate S.


The appropriate S we used was G itself. But there may be better 
choices. We shall see some in the problems to follow. 

When G is finite, we can take the set S in Theorem 2.5.1 to be finite, in
which case A(S) is S_n and its elements are permutations. In this case,
Cayley’s Theorem is usually stated as: A finite group can be represented as a
group of permutations.


(Arthur Cayley (1821-1895) was an English mathematician who worked in ma- 
trix theory, invariant theory, and many other parts of algebra.) 


This is a good place to discuss the importance of “isomorphism.” Let φ
be an isomorphism of G onto G'. We can view G' as a relabeling of G,
using the label φ(x) for the element x. Is this labeling consistent with the
structure of G as a group? That is, if x is labeled φ(x), y labeled φ(y), what
is xy labeled as? Since φ(x)φ(y) = φ(xy), we see that xy is labeled as
φ(x)φ(y), so this renaming of the elements is consistent with the product in
G. So two groups that are isomorphic, although they need not be equal, are
in a certain sense, as described above, equal. Often, it is desirable to be able
to identify a given group as isomorphic to some concrete group that we
know.

We go on with more examples. 


9. Let G be any group, a ∈ G fixed in the discussion. Define φ: G → G by
φ(x) = a^{-1}xa for all x ∈ G. We claim that φ is an isomorphism of G onto
itself. First,


φ(xy) = a^{-1}(xy)a = a^{-1}xa · a^{-1}ya = φ(x)φ(y),


so φ is at least a homomorphism of G into itself. It is 1-1 for if φ(x) = φ(y),
then a^{-1}xa = a^{-1}ya, so by cancellation in G we get x = y. Finally, φ is onto,
for x = a^{-1}(axa^{-1})a = φ(axa^{-1}) for any x ∈ G.
Here φ is called the inner automorphism of G induced by a. The notion
of automorphism and some of its properties will come up in the problems.
One final example:


10. Let G be the group of reals under + and let G' be the group of all
nonzero complex numbers under multiplication. Define φ: G → G' by


φ(x) = cos x + i sin x.


We saw that (cos x + i sin x)(cos y + i sin y) = cos(x + y) + i sin(x + y),
hence φ(x)φ(y) = φ(x + y) and φ is a homomorphism of G into G'. φ is not
1-1 because, for instance, φ(0) = φ(2π) = 1, nor is φ onto.


Now that we have a few examples in hand, we start a little investigation 
of homomorphisms. We begin with 


Lemma 2.5.2. If φ is a homomorphism of G into G', then:
(a) φ(e) = e', the unit element of G'.
(b) φ(a^{-1}) = φ(a)^{-1} for all a ∈ G.


Proof. Since x = xe, φ(x) = φ(xe) = φ(x)φ(e); by cancellation in G'
we get φ(e) = e'. Also, φ(aa^{-1}) = φ(e) = e', hence e' = φ(aa^{-1}) =
φ(a)φ(a^{-1}), which proves that φ(a^{-1}) = φ(a)^{-1}. □


Definition. The image of φ, φ(G), is φ(G) = {φ(a) | a ∈ G}.
We leave to the reader the proof of


Lemma 2.5.3. If φ is a homomorphism of G into G', then the image
of φ is a subgroup of G'.


We singled out certain homomorphisms and called them monomor- 
phisms. Their property was that they were 1-1. We want to measure how far 
a given homomorphism is from being a monomorphism. This prompts the 


Definition. If φ is a homomorphism of G into G', then the kernel of
φ, Ker φ, is defined by Ker φ = {a ∈ G | φ(a) = e'}.


Ker φ measures the lack of 1-1'ness at one point e'. We claim that
this lack is rather uniform. What is W = {x ∈ G | φ(x) = w'} for a given
w' ∈ G'? We show that if φ(x) = w' for some x ∈ G, then W =
{kx | k ∈ Ker φ} = (Ker φ)x. Clearly, if k ∈ Ker φ and φ(x) = w', then
φ(kx) = φ(k)φ(x) = e'φ(x) = w', so kx ∈ W. Also, if φ(x) = φ(y) = w',
then φ(x) = φ(y), hence φ(y)φ(x)^{-1} = e'; but φ(x)^{-1} = φ(x^{-1}) by
Lemma 2.5.2, so e' = φ(y)φ(x)^{-1} = φ(y)φ(x^{-1}) = φ(yx^{-1}), whence
yx^{-1} ∈ Ker φ and so y ∈ (Ker φ)x. Thus the inverse image of any element
w' in φ(G) ⊂ G' is the set (Ker φ)x, where x is any element in G such that
φ(x) = w'.

We state this as


Lemma 2.5.4. If w' ∈ G' is of the form φ(x) = w', then
{y ∈ G | φ(y) = w'} = (Ker φ)x.
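
Lemma 2.5.4 can be watched in action on the squaring map of Example 2. The sketch below is illustrative and rests on choices of our own: it takes G = U_15 with classes represented by their remainders, defines sq(a) = a^2 mod 15 as the homomorphism, and compares each inverse image with the corresponding coset of the kernel. The names sq, fiber, and coset are ours.

```python
from math import gcd

n = 15
G = [a for a in range(1, n) if gcd(a, n) == 1]   # U_15

def sq(a):
    """The homomorphism a -> a^2 of the abelian group U_15 into itself."""
    return (a * a) % n

kernel = {a for a in G if sq(a) == 1}

def fiber(w):
    """Inverse image of w: all y in G with sq(y) = w."""
    return {y for y in G if sq(y) == w}

def coset(x):
    """The coset (Ker sq)x."""
    return {(k * x) % n for k in kernel}

print(sorted(kernel))                                # [1, 4, 11, 14]
print(sorted(fiber(4)), sorted(coset(2)))            # [2, 7, 8, 13] [2, 7, 8, 13]
print(all(fiber(sq(x)) == coset(x) for x in G))      # True
```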

We now shall study some basic properties of the kernels of
homomorphisms.


Theorem 2.5.5. If φ is a homomorphism of G into G', then
(a) Ker φ is a subgroup of G.
(b) Given a ∈ G, a^{-1}(Ker φ)a ⊂ Ker φ.


Proof. Although this is so important, its proof is easy. If a, b ∈ Ker φ,
then φ(a) = φ(b) = e', hence φ(ab) = φ(a)φ(b) = e', whence ab ∈ Ker φ,
so Ker φ is closed under product. Also φ(a) = e' implies that φ(a^{-1}) =
φ(a)^{-1} = e', and so a^{-1} ∈ Ker φ. Therefore, Ker φ is a subgroup of G. If
k ∈ Ker φ and a ∈ G, then φ(k) = e'. Consequently, φ(a^{-1}ka) =
φ(a^{-1})φ(k)φ(a) = φ(a^{-1})e'φ(a) = φ(a^{-1})φ(a) = φ(a^{-1}a) = φ(e) = e'. This
tells us that a^{-1}ka ∈ Ker φ, hence a^{-1}(Ker φ)a ⊂ Ker φ. The theorem is
now completely proved. □


Corollary. If φ is a homomorphism of G into G', then φ is a
monomorphism if and only if Ker φ = (e).


Proof. This result is really a corollary to Lemma 2.5.4. We leave the
few details to the reader. □


Property (b) of Ker φ in Theorem 2.5.5 is an interesting and basic one
for a subgroup to enjoy. We ran into this property in the text material and
problems earlier on several occasions. We use it to define the ultra-important
class of subgroups of a group.


Definition. The subgroup N of G is said to be a normal subgroup of
G if a^{-1}Na ⊂ N for every a ∈ G.


Of course, Ker φ, for any homomorphism, is a normal subgroup of G.
As we shall see in the next section, every normal subgroup of G is the kernel
of some appropriate homomorphism of G into an appropriate group G'. So in
a certain sense the notions of homomorphism and normal subgroups will be
shown to be equivalent.

Although we defined a normal subgroup via a^{-1}Na ⊂ N, we actually
have a^{-1}Na = N. For if a^{-1}Na ⊂ N for all a ∈ G, then N = a(a^{-1}Na)a^{-1} ⊂
aNa^{-1} = (a^{-1})^{-1}Na^{-1} ⊂ N. So N = aNa^{-1} for every a ∈ G. Transposing,
we have Na = aN; that is, every left coset of N in G is a right coset of
N in G.

On the other hand, if every left coset of N in G is a right coset, then the
left coset aN, which contains a, must be equal to the right coset containing a,
namely Na. Thus, aN = Na and N = a^{-1}Na for all a ∈ G, which is to say that
N is normal in G.

We write “N is a normal subgroup of G” by the abbreviated symbol
N ◁ G.

Note that a^{-1}Na = N does not mean that a^{-1}na = n for every n ∈ N.
No; it means merely that the set of all a^{-1}na is the same as the set of all n.

We have proved 


Theorem 2.5.6. N ◁ G if and only if every left coset of N in G is a
right coset of N in G.


Before going any further, we pause to look at some examples of kernels 
of homomorphisms and normal subgroups. 

If G is abelian, then every subgroup of G is normal, for a^{-1}xa = x for
every a, x ∈ G. The converse of this is not true. Nonabelian groups exist in
which every subgroup is normal. See if you can find such an example of order
8. Such nonabelian groups are called Hamiltonian, after the Irish
mathematician W. R. Hamilton (1805-1865). The desired group of order 8 can be found
in the quaternions of Hamilton, which we introduce in Chapter 4, Section 1.

In Example 1, φ(x) = log_10 x, and Ker φ = {x | log_10 x = 0} = {1}. In
Example 2, where G is abelian, and φ(x) = x^2,


Ker φ = {x ∈ G | x^2 = e}.


The kernel of the trivial homomorphism of Example 3 is all of G. In
Example 4, Ker φ is the set of all even integers. In Example 5, Ker φ =
{a ∈ C' | |a| = 1}, which can be identified, from the polar form of a complex
number, as Ker φ = {cos x + i sin x | x real}. In Example 6, Ker φ =
{T_{1,b} ∈ G | b real}. In Example 7, Ker φ is the set of all multiples of n. In
Examples 8 and 9, the kernels consist of e alone, for the maps are
monomorphisms. In Example 10, we see that Ker φ = {2πm | m any integer}.

Of course, all the kernels above are normal subgroups of their respective
groups. We should look at some normal subgroups, intrinsically in G itself,
without recourse to the kernels of homomorphism. We go back to the
examples of Section 1.


1. In Example 7, H = {T_{a,b} ∈ G | a rational}. If T_{x,y} ∈ G, we leave it to the
reader to check that T_{x,y}^{-1} H T_{x,y} ⊂ H and so H ◁ G.

2. In Example 9 the subgroup {i, g, g^2, g^3} ◁ G. Here too we leave the
checking to the reader.

3. In Example 10 the subgroup H = {i, h, h^2, ..., h^{n-1}} is normal in G. This
we also leave to the reader.

4. If G is any group, Z(G), the center of G, is a normal subgroup of G (see
Example 11 of Section 3).

5. If G = S_3, G has the elements i, f, g, g^2, fg, and gf, where f(x_1) = x_2,
f(x_2) = x_1, f(x_3) = x_3 and g(x_1) = x_2, g(x_2) = x_3, g(x_3) = x_1. We claim
that the subgroup N = {i, g, g^2} ◁ S_3. As we saw earlier (or can compute
now), fgf^{-1} = g^{-1} = g^2, fg^2f^{-1} = g, (fg)g(fg)^{-1} = fgf^{-1} =
g^2, and so on. So N ◁ S_3 follows.
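
The computation in this last example, and the coset criterion of Theorem 2.5.6, can be carried out by machine. The Python sketch below is an illustration only. It assumes permutations of x_1, x_2, x_3 are stored as tuples on 0, 1, 2, takes the product to be composition of maps, and uses helper names of our own; for contrast it also tests the subgroup H = {i, f}.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation x -> p(q(x)); permutations are tuples on 0, 1, 2."""
    return tuple(p[i] for i in q)

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def is_normal(N, G):
    """Check that a^{-1} n a lies in N for every a in G and every n in N."""
    return all(compose(inverse(a), compose(n, a)) in N for a in G for n in N)

S3 = list(permutations(range(3)))
i = (0, 1, 2)
f = (1, 0, 2)            # swaps x_1 and x_2, fixes x_3 (0-indexed here)
g = (1, 2, 0)            # x_1 -> x_2 -> x_3 -> x_1
N = {i, g, compose(g, g)}
H = {i, f}

left_cosets = {frozenset(compose(a, n) for n in N) for a in S3}
right_cosets = {frozenset(compose(n, a) for n in N) for a in S3}

print(is_normal(N, S3), left_cosets == right_cosets)   # True True
print(is_normal(H, S3))                                 # False
```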


The material in this section has been a rather rich diet. It may not seem 
so, but the ideas presented, although simple, are quite subtle. We recom- 
mend that the reader digest the concepts and results thoroughly before going 
on. One way of seeing how complete this digestion is, is to take a stab at 
many of the almost infinite list of problems that follow. The material of the 
next section is even a richer diet, and even harder to digest. Avoid a mathe- 
matical stomachache later by assimilating this section well. 


PROBLEMS 


Easier Problems 


1. Determine in each of the parts if the given mapping is a homomorphism.
If so, identify its kernel and whether or not the mapping is 1-1 or onto.
(a) G = Z under +, G' = Z_n, φ(a) = [a] for a ∈ Z.

(b) G group, φ: G → G defined by φ(a) = a^{-1} for a ∈ G.

(c) G abelian group, φ: G → G defined by φ(a) = a^{-1} for a ∈ G.

(d) G group of all nonzero real numbers under multiplication, G' =
{1, -1}, φ(r) = 1 if r is positive, φ(r) = -1 if r is negative.

(e) G an abelian group, n > 1 a fixed integer, and φ: G → G defined by
φ(a) = a^n for a ∈ G.

2. Recall that G ≃ G' means that G is isomorphic to G'. Prove that for all
groups G_1, G_2, G_3:

(a) G_1 ≃ G_1.

(b) G_1 ≃ G_2 implies that G_2 ≃ G_1.

(c) G_1 ≃ G_2, G_2 ≃ G_3 implies that G_1 ≃ G_3.

3. Let G be any group and A(G) the set of all 1-1 mappings of G, as a set,
onto itself. Define L_a: G → G by L_a(x) = xa^{-1}. Prove that:

(a) L_a ∈ A(G).

(b) L_aL_b = L_{ab}.

(c) The mapping ψ: G → A(G) defined by ψ(a) = L_a is a
monomorphism of G into A(G).

4. In Problem 3 prove that for all a, b ∈ G, T_aL_b = L_bT_a, where T_a is
defined as in Example 8.

5. In Problem 4, show that if V ∈ A(G) is such that T_aV = VT_a for all
a ∈ G, then V = L_b for some b ∈ G. (Hint: Acting on e ∈ G, find out
what b should be.)

6. Prove that if φ: G → G' is a homomorphism, then φ(G), the image of G,
is a subgroup of G'.

7. Show that φ: G → G', where φ is a homomorphism, is a monomorphism
if and only if Ker φ = (e).

8. Find an isomorphism of G, the group of all real numbers under +, onto
G', the group of all positive real numbers under multiplication.

9. Verify that if G is the group in Example 6 of Section 1, and H =
{T_{a,b} ∈ G | a rational}, then H ◁ G.

10. Verify that in Example 9 of Section 1, the set H = {i, g, g^2, g^3} is a
normal subgroup of G, the dihedral group of order 8.

11. Verify that in Example 10 of Section 1, the subgroup

H = {i, h, h^2, ..., h^{n-1}}

is normal in G.

12. Prove that if Z(G) is the center of G, then Z(G) ◁ G.

13. If G is a finite abelian group of order n and φ: G → G is defined by
φ(a) = a^m for all a ∈ G, find the necessary and sufficient condition that
φ be an isomorphism of G onto itself.

14. If G is abelian and φ: G → G' is a homomorphism of G onto G', prove
that G' is abelian.

15. If G is any group, N ◁ G, and φ: G → G' a homomorphism of G onto
G', prove that the image, φ(N), of N is a normal subgroup of G'.

16. If N ◁ G and M ◁ G and MN = {mn | m ∈ M, n ∈ N}, prove that MN is
a subgroup of G and that MN ◁ G.

17. If M ◁ G, N ◁ G, prove that M ∩ N ◁ G.

18. If H is any subgroup of G and N = ∩_{a∈G} a^{-1}Ha, prove that N ◁ G.

19. If H is a subgroup of G, let N(H) be defined by the relation N(H) =
{a ∈ G | a^{-1}Ha = H}. Prove that:

(a) N(H) is a subgroup of G and N(H) ⊃ H.

(b) H ◁ N(H).

(c) If K is a subgroup of G such that H ◁ K, then K ⊂ N(H). [So N(H)
is the largest subgroup of G in which H is normal.]

20. If M ◁ G, N ◁ G, and M ∩ N = (e), show that for m ∈ M, n ∈ N, mn = nm.

21. Let S be any set having more than two elements and A(S) the set of
all 1-1 mappings of S onto itself. If s ∈ S, we define H(s) =
{f ∈ A(S) | f(s) = s}. Prove that H(s) cannot be a normal subgroup of A(S).


22. Let G = S_3, the symmetric group of degree 3 and let H = {i, f}, where
f(x_1) = x_2, f(x_2) = x_1, f(x_3) = x_3.

(a) Write down all the left cosets of H in G. 
(b) Write down all the right cosets of H in G. 
(c) Is every left coset of H a right coset of H? 

23. Let G be a group such that all subgroups of G are normal in G. If
a, b ∈ G, prove that ba = aʲb for some j.

24. If G₁, G₂ are two groups, let G = G₁ × G₂, the Cartesian product of G₁,
G₂ [i.e., G is the set of all ordered pairs (a, b) where a ∈ G₁, b ∈ G₂].
Define a product in G by (a₁, b₁)(a₂, b₂) = (a₁a₂, b₁b₂).

(a) Prove that G is a group.

(b) Show that there is a monomorphism φ₁ of G₁ into G such that
φ₁(G₁) ◁ G, given by φ₁(a₁) = (a₁, e₂), where e₂ is the identity
element of G₂.

(c) Find the similar monomorphism φ₂ of G₂ into G.

(d) Using the mappings φ₁, φ₂ of Parts (b) and (c), prove that
φ₁(G₁)φ₂(G₂) = G and φ₁(G₁) ∩ φ₂(G₂) is the identity element of G.

(e) Prove that G₁ × G₂ ≃ G₂ × G₁.

25. Let G be a group and let W = G × G as defined in Problem 24. Prove that:

(a) The mapping φ: G → W defined by φ(a) = (a, a) is a monomorphism
of G into W.

(b) The image φ(G) in W [i.e., {(a, a) | a ∈ G}] is normal in W if and only
if G is abelian.


Middle-Level Problems 


*26. If G is a group and a ∈ G, define σ_a: G → G by σ_a(g) = aga⁻¹. We saw
in Example 9 of this section that σ_a is an isomorphism of G onto itself, so
σ_a ∈ A(G), the group of all 1-1 mappings of G (as a set) onto itself.
Define ψ: G → A(G) by ψ(a) = σ_a for all a ∈ G. Prove that:

(a) ψ is a homomorphism of G into A(G).

(b) Ker ψ = Z(G), the center of G.

27. If θ is an automorphism of G and N ◁ G, prove that θ(N) ◁ G.

28. Let θ, ψ be automorphisms of G, and let θψ be the product of θ and ψ as
mappings on G. Prove that θψ is an automorphism of G, and that θ⁻¹ is
an automorphism of G, so that the set of all automorphisms of G is itself
a group.

*29. A subgroup T of a group W is called characteristic if φ(T) ⊂ T for all
automorphisms, φ, of W. Prove that:

(a) M characteristic in G implies that M ◁ G.

(b) M, N characteristic in G implies that MN is characteristic in G. 

(c) A normal subgroup of a group need not be characteristic. (This is 
quite hard; you must find an example of a group G and a noncharac- 
teristic normal subgroup.) 

30. Suppose that |G| = pm, where p ∤ m and p is a prime. If H is a normal
subgroup of order p in G, prove that H is characteristic.

31. Suppose that G is an abelian group of order pⁿm where p ∤ m is a prime.
If H is a subgroup of G of order pⁿ, prove that H is a characteristic
subgroup of G.

32. Do Problem 31 even if G is not abelian if you happen to know that for
some reason or other H ◁ G.

33. Suppose that N ◁ G and M ⊂ N is a characteristic subgroup of N. Prove
that M ◁ G. (It is not true that if M ◁ N and N ◁ G, then M must be
normal in G. See Problem 50.)

34. Let G be a group, 𝒜(G) the group of all automorphisms of G. (See
Problem 28.) Let I(G) = {σ_a | a ∈ G}, where σ_a is as defined in Problem 26.
Prove that I(G) ◁ 𝒜(G).

35. Show that Z(G), the center of G, is a characteristic subgroup of G.

36. If N ◁ G and H is a subgroup of G, show that H ∩ N ◁ H.


Harder Problems 


37. If G is a nonabelian group of order 6, prove that G ≃ S₃.

38. Let G be a group and H a subgroup of G. Let S = {Ha | a ∈ G} be the
set of all right cosets of H in G. Define, for b ∈ G, T_b: S → S by T_b(Ha)
= Hab⁻¹.

(a) Prove that T_bT_c = T_{bc} for all b, c ∈ G [therefore the mapping
ψ: G → A(S) defined by ψ(b) = T_b is a homomorphism].

(b) Describe Ker ψ, the kernel of ψ: G → A(S).

(c) Show that Ker ψ is the largest normal subgroup of G lying in H
[largest in the sense that if N ◁ G and N ⊂ H, then N ⊂ Ker ψ].

39. Use the result of Problem 38 to redo Problem 37.

Recall that if H is a subgroup of G, then the index of H in G, i_G(H), is
the number of distinct right cosets of H in G (if this number is finite).

40. If G is a finite group, H a subgroup of G such that n ∤ i_G(H)! where n =
|G|, prove that there is a normal subgroup N ≠ (e) of G contained in H.

41. Suppose that you know that a group G of order 21 contains an element a
of order 7. Prove that A = (a), the subgroup generated by a, is normal in
G. (Hint: Use the result of Problem 40.)

42. Suppose that you know that a group G of order 36 has a subgroup H of
order 9. Prove that either H ◁ G or there exists a subgroup N ◁ G,
N ⊂ H, and |N| = 3.

43. Prove that a group of order 9 must be abelian. 

44. Prove that a group of order p², p a prime, has a normal subgroup of
order p.

45. Using the result of Problem 44, prove that a group of order p², p a
prime, must be abelian.

46. Let G be a group of order 15; show that there is an element a ≠ e in G
such that a³ = e and an element b ≠ e such that b⁵ = e.

47. In Problem 46, show that both subgroups A = {e, a, a²} and B =
{e, b, b², b³, b⁴} are normal in G.

48. From the result of Problem 47, show that any group of order 15 is cyclic.


Very Hard Problems


49. Let G be a group, H a subgroup of G such that i_G(H) is finite. Prove that
there is a subgroup N ⊂ H, N ◁ G such that i_G(N) is finite.

50. Construct a group G such that G has a normal subgroup N, and N has a
normal subgroup M (i.e., N ◁ G, M ◁ N), yet M is not normal in G.

51. Let G be a finite group, φ an automorphism of G such that φ² is the
identity automorphism of G. Suppose that φ(x) = x implies that x = e. Prove
that G is abelian and φ(a) = a⁻¹ for all a ∈ G.

52. Let G be a finite group and φ an automorphism of G such that φ(x) =
x⁻¹ for more than three-fourths of the elements of G. Prove that φ(y) =
y⁻¹ for all y ∈ G, and so G is abelian.


6. FACTOR GROUPS 


Let G be a group and N a normal subgroup of G. In proving Lagrange’s
Theorem we used, for an arbitrary subgroup H, the equivalence relation a ~ b if
ab⁻¹ ∈ H. Let’s try this out when N is normal and see if we can say a little
more than one could say for just any old subgroup.

So, let a ~ b if ab⁻¹ ∈ N and let [a] = {x ∈ G | x ~ a}. As we saw
earlier, [a] = Na, the right coset of N in G containing a. Recall that in
looking at Zₙ we defined for it an operation + via [a] + [b] = [a + b]. Why
not try something similar for an arbitrary group G and a normal subgroup
N of G?

So let M = {[a] | a ∈ G}, where [a] = {x ∈ G | xa⁻¹ ∈ N} = Na. We define
a product in M via [a][b] = [ab]. We shall soon show that M is a group
under this product. But first and foremost we must show that this product in
M is well-defined. In other words, we must show that if [a] = [a'] and [b] =
[b'], then [ab] = [a'b'], for this would show that [a][b] = [ab] = [a'b'] =
[a'][b']; equivalently, that this product of classes does not depend on the
particular representatives we use for the classes.

Therefore let us suppose that [a] = [a'] and [b] = [b']. From the
definition of our equivalence we have that a' = na, where n ∈ N. Similarly,
b' = mb, where m ∈ N. Thus a'b' = namb = n(ama⁻¹)ab; since N ◁ G,
ama⁻¹ is in N, so n(ama⁻¹) is also in N. So if we let n₁ = n(ama⁻¹), then
n₁ ∈ N and a'b' = n₁ab. But this tells us that a'b' ∈ Nab, so that
a'b' ~ ab, from which we have that [a'b'] = [ab], the exact thing we
required to ensure that our product in M was well-defined.

Thus M is now endowed with a well-defined product [a][b] = [ab].
We now verify the group axioms for M. Closure we have from the very
definition of this product. If [a], [b], and [c] are in M, then [a]([b][c]) =
[a][bc] = [a(bc)] = [(ab)c] (since the product in G is associative) =
[ab][c] = ([a][b])[c]. Therefore, the associative law has been established
for the product in M. What about a unit element? Why not try the obvious
choice, namely [e]? We immediately see that [a][e] = [ae] = [a] and
[e][a] = [ea] = [a], so [e] does act as the unit element for M. Finally, what
about inverses? Here, too, the obvious choice is the correct one. If a ∈ G,
then [a][a⁻¹] = [aa⁻¹] = [e], hence [a⁻¹] acts as the inverse of [a] relative
to the product we have defined in M.

We want to give M a name, and better still, a symbol that indicates its
dependence on G and N. The symbol we use for M is G/N (read “G over N” or
“G mod N”) and G/N is called the factor group or quotient group of G by N.

What we have shown is the very important


Theorem 2.6.1. If N ◁ G and

G/N = {[a] | a ∈ G} = {Na | a ∈ G},

then G/N is a group relative to the operation [a][b] = [ab].


One observation must immediately be made, namely


Theorem 2.6.2. If N ◁ G, then there is a homomorphism ψ of G onto
G/N such that Ker ψ, the kernel of ψ, is N.

Proof. The most natural mapping from G to G/N is the one that
does the trick. Define ψ: G → G/N by ψ(a) = [a]. Our product as defined
in G/N makes of ψ a homomorphism, for ψ(ab) = [ab] = [a][b] =
ψ(a)ψ(b). Since every element X ∈ G/N is of the form X = [b] = ψ(b) for
some b ∈ G, ψ is onto. Finally, what is the kernel, Ker ψ, of ψ? By
definition, Ker ψ = {a ∈ G | ψ(a) = E}, where E is the unit element of G/N. But
what is E? Nothing other than E = [e] = Ne = N, and a ∈ Ker ψ if and
only if E = N = ψ(a) = Na. But Na = N tells us that a = ea ∈ Na = N, so
we see that Ker ψ ⊂ N. That N ⊂ Ker ψ, which is easy, we leave to the
reader. So Ker ψ = N. □


Theorem 2.6.2 substantiates the remark we made in the preceding section
that every normal subgroup N of G is the kernel of some homomorphism
of G onto some group. The “some homomorphism” is the ψ defined
above and the “some group” is G/N.

This construction of the factor group G by N is possibly the single most 
important construction in group theory. In other algebraic systems we shall 
have analogous constructions, as we shall see later. 

One might ask: Where in this whole affair did the normality of N in G 
enter? Why not do the same thing for any subgroup H of G? So let’s try and 
see what happens. As before, we define 


W = {[a] | a ∈ G} = {Ha | a ∈ G}


where the equivalence a ~ b is defined by ab⁻¹ ∈ H. We try to introduce a
product in W as we did for G/N by defining [a][b] = [ab]. Is this product
well-defined? If h ∈ H, then [hb] = [b], so for the product to be well-defined, we
would need that [a][b] = [a][hb], that is, [ab] = [ahb]. This gives us that Hab
= Hahb, and so Ha = Hah; this implies that H = Haha⁻¹, whence aha⁻¹ ∈ H.
That is, for all a ∈ G and all h ∈ H, aha⁻¹ must be in H; in other words, H
must be normal in G. So we see that in order for the product defined in W to
be well-defined, H must be a normal subgroup of G.
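
The role of normality can be checked by brute force in a small case. The following Python sketch is only an illustration and is not part of the text: it assumes G = S₃ with permutations stored as tuples, and the helper names compose, right_coset, and find_conflict are hypothetical. It searches for two representatives b and hb of the same right coset for which [ab] and [a(hb)] differ.

```python
from itertools import permutations

def compose(p, q):
    """Product pq of two permutations stored as tuples: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def right_coset(H, a):
    """The right coset Ha = {ha | h in H}, as a frozenset."""
    return frozenset(compose(h, a) for h in H)

def find_conflict(G, H):
    """Look for a, b in G and h in H with H(ab) != H(a(hb)).

    Since hb lies in the same right coset as b, such a triple shows that
    [a][b] = [ab] depends on the representative chosen for [b].
    Returns (a, b, hb) if a conflict exists, otherwise None.
    """
    for a in G:
        for b in G:
            for h in H:
                hb = compose(h, b)
                if right_coset(H, compose(a, b)) != right_coset(H, compose(a, hb)):
                    return a, b, hb
    return None

G = list(permutations(range(3)))           # S_3 acting on {0, 1, 2}
A3 = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]     # the normal subgroup of order 3
H = [(0, 1, 2), (1, 0, 2)]                 # identity and one transposition: not normal

print(find_conflict(G, A3))                # searches the normal subgroup
print(find_conflict(G, H))                 # searches the non-normal subgroup
```

For the normal subgroup the search comes back empty, while for the two-element subgroup it returns a concrete triple, in line with the argument above.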

We view this matter of the quotient group in a slightly different way. If 
A, B are subsets of G, let AB = {ab | a ∈ A, b ∈ B}. If H is a subgroup of G,
then HH ⊂ H is another way of saying that H is closed under the product
of G.

Let G/N = {Na | a ∈ G} be the set of all right cosets of the normal
subgroup N in G. Using the product of subsets of G as defined above, what is
(Na)(Nb)? By definition, (Na)(Nb) consists of all elements of the form
(na)(mb), where n, m ∈ N, and so


(na)(mb) = (nama⁻¹)(ab) = n₁ab,


where n₁ = nama⁻¹ is in N, since N is normal. Thus (Na)(Nb) ⊂ Nab. On the
other hand, if n ∈ N, then


n(ab) = (na)(eb) ∈ (Na)(Nb),


so that Nab ⊂ (Na)(Nb). In short, we have shown that the product, as
subsets of G, of Na and Nb is given by the formula (Na)(Nb) = Nab. All the
other group axioms for G/N, as defined here, are now readily verified from
this product formula.

Another way of seeing that (Na)(Nb) = Nab is to note that by the nor- 
mality of N, aN = Na, hence (Na)(Nb) = N(aN)b = N(Na)b = NNab = 
Nab, since NN = N (because N is a subgroup of G). 
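
The product formula (Na)(Nb) = Nab can be verified by machine in a small example. The sketch below is illustrative only; it assumes G = S₃ and N its subgroup of order 3, with permutations stored as tuples, and the names compose, set_product, and coset are hypothetical helpers.

```python
from itertools import permutations

def compose(p, q):
    """Product pq of two permutations stored as tuples: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def set_product(A, B):
    """AB = {ab | a in A, b in B} for subsets A, B of the group."""
    return frozenset(compose(a, b) for a in A for b in B)

G = list(permutations(range(3)))                      # S_3
N = frozenset([(0, 1, 2), (1, 2, 0), (2, 0, 1)])      # normal subgroup of order 3

def coset(a):
    """The right coset Na."""
    return frozenset(compose(n, a) for n in N)

cosets = {coset(a) for a in G}                        # the elements of G/N

# Check (Na)(Nb) = Nab, as subsets of G, for every pair a, b in G.
formula_holds = all(
    set_product(coset(a), coset(b)) == coset(compose(a, b))
    for a in G for b in G
)
print(formula_holds, len(cosets))
```

The number of cosets found here is |G|/|N|, which is the count one expects for the order of G/N when G is finite.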

However we view G/N, as equivalence classes or as a set of certain
subsets of G, we do get a group whose structure is intimately tied to that of
G, via the natural homomorphism ψ of G onto G/N.

We shall see very soon how we combine induction and the structure of
G/N to get information about G.

When G is a finite group and N ◁ G, then the number of right cosets of
N in G, i_G(N), is given (as the proof of Lagrange’s Theorem showed) by
i_G(N) = |G|/|N|. But this is the order of G/N, which is the set of all the right
cosets of N in G. Thus |G/N| = |G|/|N|. We state this more formally as


Theorem 2.6.3. If G is a finite group and N ◁ G, then |G/N| =
|G|/|N|.


As an application of what we have been talking about here, we shall 
prove a special case of a theorem that we shall prove in its full generality 
later. The proof we give—for the abelian case—is not a particularly good 
one, but it illustrates quite clearly a general technique, that of pulling back 
information about G/N to get information about G itself. 


The theorem we are about to prove is due to the great French mathematician 
A. L. Cauchy (1789-1857), whose most basic contributions were in complex 
variable theory. 


Theorem 2.6.4 (Cauchy). If G is a finite abelian group of order |G| 
and p is a prime that divides |G|, then G has an element of order p. 


Proof. Before getting involved with the proof, we point out to the 
reader that the theorem is true for any finite group. We shall prove it in the 
general case later, with a proof that will be much more beautiful than the one 
we are about to give for the special, abelian case. 

We proceed by induction on |G|. What does this mean precisely? We shall
assume the theorem to be true for all abelian groups of order less than |G| and
show that this forces the theorem to be true for G. If |G| = 1, there is no such p
and the theorem is vacuously true. So we have a starting point for our induction.

Suppose that there is a subgroup (e) ≠ N ≠ G. Since |N| < |G|, if p
divides |N|, by our induction hypothesis there would be an element of order p
in N, hence in G, and we would be done. So we may suppose that p ∤ |N|.
Since G is abelian, every subgroup is normal, so we can form G/N. Because p
divides |G| and p ∤ |N|, and because |G/N| = |G|/|N|, we have that p
divides |G/N|. The group G/N is abelian, since G is (Prove!) and since
N ≠ (e), |N| > 1, so |G/N| = |G|/|N| < |G|. Thus, again by induction, there
exists an element in G/N of order p. In other words, there exists an a ∈ G
such that [a]ᵖ = [e], but [a] ≠ [e]. This translates to aᵖ ∈ N, a ∉ N. So if
m = |N|, then (aᵖ)ᵐ = e. So (aᵐ)ᵖ = e. If we could show that b = aᵐ ≠ e,
then b would be the required element of order p in G. But if aᵐ = e, then
[a]ᵐ = [e], and since [a] has order p, p | m (see Problem 31 of Section 4). But,
by assumption, p ∤ m = |N|. So we are done if G has a nontrivial subgroup.

But if G has no nontrivial subgroups, it must be cyclic of prime order.
(See Problem 16 of Section 3, which you should be able to handle more
easily now.) What is this “prime order”? Because p divides |G|, we must have
|G| = p. But then any element a ≠ e ∈ G satisfies aᵖ = e and is of order p.
This completes the induction, and so proves the theorem. □
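
The key step of the proof, pulling an element of order p in G/N back to the element b = aᵐ of G, can be traced in a concrete case. The Python sketch below is an illustration under assumed data, not part of the text: G is taken to be the additive group Z₁₂, N = {0, 4, 8}, and p = 2, so that aᵐ is written additively as m·a. The helper name order is hypothetical.

```python
def order(x, n):
    """Order of x in the additive group Z_n."""
    k, y = 1, x % n
    while y != 0:
        y = (y + x) % n
        k += 1
    return k

n, p = 12, 2
N = {0, 4, 8}          # subgroup of Z_12 of order m = 3; p does not divide m
m = len(N)

# Elements a whose coset [a] has order p in G/N: p*a lies in N but a does not.
lifts = [a for a in range(n) if a not in N and (p * a) % n in N]

# The proof's candidate b = a^m, written additively as m*a.
pulled_back = {(m * a) % n for a in lifts}

print(lifts, pulled_back, [order(b, n) for b in pulled_back])
```

Every lift a leads to an element m·a of order p in G, as the proof predicts.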


We shall have other applications of this kind of group-theoretic argu- 
ment in the problems. 

The notion of a factor group is a very subtle one, and of the greatest 
importance in the subject. The formation of a new set from an old one by 
using as elements of this new set subsets of the old one is strange to the neo- 
phyte seeing this kind of construction for the first time. So it is worthwhile 
looking at this whole matter from a variety of points of view. We consider 
G/N from another angle now. 

What are we doing when we form G/N? Sure, we are looking at
equivalence classes defined via N. Let’s look at it another way. What we are doing
is identifying two elements in G if they satisfy the relation ab⁻¹ ∈ N. In a
sense we are blotting out N. So although G/N is not a subgroup of G, we can
look at it as G, with N blotted out, and two elements as equal if they are
equal “up to N.”

For instance, in forming Z/N, where Z is the group of integers and N is 
the set of all multiples of 5 in Z, what we are doing is identifying 1 with 6, 11, 
16, —4, —9, and so on, and we are identifying all multiples of 5 with 0. The 
nice thing about all this is that this identification jibes with addition in Z 
when we go over to Z/N. 
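
The identification just described is easy to experiment with. In the sketch below, which is only an illustration, the hypothetical function cls sends an integer a to the representative in {0, 1, 2, 3, 4} of its coset a + N, where N is the set of multiples of 5.

```python
def cls(a, n=5):
    """Representative in {0, ..., n-1} of the coset a + N, where N = nZ."""
    return a % n   # Python's % returns a value in range(n) even for negative a

same_as_one = [1, 6, 11, 16, -4, -9]
classes = {cls(a) for a in same_as_one}

# Addition of classes does not depend on the representatives chosen.
compatible = all(
    cls(a + b) == cls(cls(a) + cls(b))
    for a in range(-20, 21) for b in range(-20, 21)
)
print(classes, compatible)
```

All six integers listed land in the same class, and adding representatives before or after reducing gives the same class over the range tested.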

Let’s look at a few examples from this point of view.


1. Let G = {T_{a,b} | a ≠ 0, b real} (Example 6 of Section 1). Let N =
{T_{1,b} | b real} ⊂ G; we saw that N ◁ G, so it makes sense to talk about G/N.
Now T_{a,b} and T_{a,0} are in the same left coset of N in G, so in G/N we are
getting an element by identifying T_{a,b} with T_{a,0}. The latter element just depends
on a. Moreover, the T_{a,b} multiply according to T_{a,b}T_{c,d} = T_{ac,ad+b} and if we
identify T_{a,b} with T_{a,0}, T_{c,d} with T_{c,0}, then their product, which is T_{ac,ad+b}, is
identified with T_{ac,0}. So in G/N multiplication is like that of the group of
nonzero real numbers under multiplication, and in some sense (which will be
made more precise in the next section) G/N can be identified with this group
of real numbers.


2. Let G be the group of real numbers under + and let Z be the group of
integers under +. Since G is abelian, Z ◁ G, and so we can talk about G/Z.
What does G/Z really look like? In forming G/Z, we are identifying any two
real numbers that differ by an integer. So 0 is identified with −1, −2, −3, ...
and 1, 2, 3, ...; 1/2 is identified with 3/2, 5/2, −1/2, −3/2, .... Every real number a
thus has a mate, ã, where 0 ≤ ã < 1. So, in G/Z, the whole real line has been
compressed into the unit interval [0, 1]. But a little more is true, for we have
also identified the end points of this unit interval. So we are bending the unit
interval around so that its two end points touch and become one. What do
we get this way? A circle, of course! So G/Z is like a circle, in a sense that
can be made precise, and this circle is a group with an appropriate product.


3. Let G be the group of nonzero complex numbers and let N =
{a ∈ G | |a| = 1}, which is the unit circle in the complex plane. Then N is a
subgroup of G and is normal since G is abelian. In going to G/N we are
declaring that any complex number of absolute value 1 will be identified with the
real number 1. Now any a ∈ G, in its polar form, can be written as a =
r(cos θ + i sin θ), where r = |a|, and |cos θ + i sin θ| = 1. In identifying
cos θ + i sin θ with 1, we are identifying a with r. So in passing to G/N every
element is being identified with a positive real number, and this identification
jibes with the products in G and in the group of positive real numbers, since
|ab| = |a||b|. So G/N is in a very real sense (no pun intended) the group of
positive real numbers under multiplication.


PROBLEMS 


1. If G is the group of all nonzero real numbers under multiplication and N 
is the subgroup of all positive real numbers, write out G/N by exhibiting 
the cosets of N in G, and construct the multiplication in G/N. 


2. If G is the group of nonzero real numbers under multiplication and
N = {1, −1}, show how you can “identify” G/N as the group of all
positive real numbers under multiplication. What are the cosets of
N in G?

3. If G is a group and N ◁ G, show that if M̄ is a subgroup of G/N and M =
{a ∈ G | Na ∈ M̄}, then M is a subgroup of G, and M ⊃ N.

4. If M̄ in Problem 3 is normal in G/N, show that the M defined is normal
in G.

5. In Problem 3, show that M/N must equal M̄.

6. Arguing as in Example 2, where we identified G/Z as a circle, where
G is the group of reals under + and Z integers, consider the following:
let G = {(a, b) | a, b real}, where + in G is defined by (a, b) + (c, d) =
(a + c, b + d) (so G is the plane), and let N = {(a, b) ∈ G | a, b are
integers}. Show that G/N can be identified as a torus (donut), and so we can
define a product on the donut so that it becomes a group. Here, you may
think of a torus as the Cartesian product of two circles.

7. If G is a cyclic group and N is a subgroup of G, show that G/N is a cyclic
group.

8. If G is an abelian group and N is a subgroup of G, show that G/N is an
abelian group.

9. Do Problems 7 and 8 by observing that G/N is a homomorphic image
of G.

10. Let G be an abelian group of order p₁^{a₁}p₂^{a₂}···p_k^{a_k}, where p₁, p₂, ..., p_k
are distinct prime numbers. Show that G has subgroups S₁, S₂, ..., S_k of
orders p₁^{a₁}, p₂^{a₂}, ..., p_k^{a_k}, respectively. (Hint: Use Cauchy’s Theorem and
pass to a factor group.) This result, which actually holds for all finite
groups, is a famous result in group theory known as Sylow’s Theorem.
We prove it in Section 11.

11. If G is a group and Z(G) the center of G, show that if G/Z(G) is cyclic,
then G is abelian.

12. If G is a group and N ◁ G is such that G/N is abelian, prove that
aba⁻¹b⁻¹ ∈ N for all a, b ∈ G.

13. If G is a group and N ◁ G is such that

aba⁻¹b⁻¹ ∈ N

for all a, b ∈ G, prove that G/N is abelian.

14. If G is an abelian group of order p₁p₂···p_k, where p₁, p₂, ..., p_k are
distinct primes, prove that G is cyclic. (See Problem 15.)

15. If G is an abelian group and if G has an element of order m and one of
order n, where m and n are relatively prime, prove that G has an element
of order mn.

16. Let G be an abelian group of order pⁿm, where p is a prime and p ∤ m.
Let P = {a ∈ G | a^{p^k} = e for some k depending on a}. Prove that:

(a) P is a subgroup of G.

(b) G/P has no elements of order p.

(c) |P| = pⁿ.

17. Let G be an abelian group of order mn, where m and n are relatively
prime. Let M = {a ∈ G | aᵐ = e}. Prove that:

(a) M is a subgroup of G.

(b) G/M has no element, x, other than the identity element, such that
xᵐ = unit element of G/M.

18. Let G be an abelian group (possibly infinite) and let the set T =
{a ∈ G | aᵐ = e, m > 1 depending on a}. Prove that:

(a) T is a subgroup of G.
(b) G/T has no element—other than its identity element—of finite order. 


7. THE HOMOMORPHISM THEOREMS 


Let G be a group and φ a homomorphism of G onto G'. If K is the kernel of
φ, then K is a normal subgroup of G, hence we can form G/K. It is fairly
natural to expect that there should be a very close relationship between G' and
G/K. The First Homomorphism Theorem, which we are about to prove,
spells out this relationship in exact detail.

But first let’s look back at some of the examples of factor groups in 
Section 6 to see explicitly what the relationship mentioned above might be. 


1. Let G = {T_{a,b} | a ≠ 0, b real} and let G' be the group of nonzero reals
under multiplication. From the product rule of these T’s, namely
T_{a,b}T_{c,d} = T_{ac,ad+b}, we determined that the mapping φ: G → G' defined
by φ(T_{a,b}) = a is a homomorphism of G onto G' with kernel K =
{T_{1,b} | b real}. On the other hand, in Example 1 of Section 6 we saw that
G/K = {KT_{a,0} | a ≠ 0 real}. Since


(KT_{a,0})(KT_{x,0}) = KT_{ax,0},


the mapping of G/K onto G', which sends each KT_{a,0} onto a, is readily
seen to be an isomorphism of G/K onto G'. Therefore, G/K ≃ G'.


2. In Example 3, G was the group of nonzero complex numbers under
multiplication and G' the group of all positive real numbers under multiplication.
Let φ: G → G' be defined by φ(a) = |a| for a ∈ G. Then, since |ab| =
|a| |b|, φ is a homomorphism of G onto G' (can you see why it is onto?).
Thus the kernel K of φ is precisely K = {a ∈ G | |a| = 1}. But we have
already seen that if |a| = 1, then a is of the form cos θ + i sin θ. So the set
K = {cos θ + i sin θ | 0 ≤ θ < 2π}. If a is any complex number, then
a = r(cos θ + i sin θ), where r = |a|, is the polar form of a. Thus Ka =
Kr(cos θ + i sin θ) = K(cos θ + i sin θ)r = Kr, since K(cos θ + i sin θ) = K
because cos θ + i sin θ ∈ K. So G/K, whose elements are the cosets Ka,
from this discussion, has all its elements of the form Kr, where r > 0. The
mapping of G/K onto G' defined by sending Kr onto r then defines an
isomorphism of G/K onto G'. So, here, too, G/K ≃ G'.


With this little experience behind us we are ready to make the jump the 
whole way, namely, to 


Theorem 2.7.1 (First Homomorphism Theorem). Let φ be a
homomorphism of G onto G' with kernel K. Then G' ≃ G/K, the isomorphism
between these being effected by the map


ψ: G/K → G'


defined by ψ(Ka) = φ(a).


Proof. The best way to show that G/K and G' are isomorphic is to
exhibit explicitly an isomorphism of G/K onto G'. The statement of the
theorem suggests what such an isomorphism might be.

So define ψ: G/K → G' by ψ(Ka) = φ(a) for a ∈ G. As usual, our first
task is to show that ψ is well defined, that is, to show that if Ka = Kb, then
ψ(Ka) = ψ(Kb). This boils down to showing that if Ka = Kb, then φ(a) =
φ(b). But if Ka = Kb, then a = kb for some k ∈ K, hence φ(a) = φ(kb) =
φ(k)φ(b). Since k ∈ K, the kernel of φ, then φ(k) = e', the identity element
of G', so we get φ(a) = φ(b). This shows that the mapping ψ is well defined.

Because φ is onto G', given x ∈ G', then x = φ(a) for some a ∈ G,
thus x = φ(a) = ψ(Ka). This shows that ψ maps G/K onto G'.

Is ψ 1-1? Suppose that ψ(Ka) = ψ(Kb); then φ(a) = ψ(Ka) =
ψ(Kb) = φ(b). Therefore, e' = φ(a)φ(b)⁻¹ = φ(a)φ(b⁻¹) = φ(ab⁻¹).
Because ab⁻¹ is thus in the kernel of φ, which is K, we have ab⁻¹ ∈ K. This
implies that Ka = Kb. In this way ψ is seen to be 1-1.

Finally, is ψ a homomorphism? We check: ψ((Ka)(Kb)) = ψ(Kab) =
φ(ab) = φ(a)φ(b) = ψ(Ka)ψ(Kb), using that φ is a homomorphism and
that (Ka)(Kb) = Kab. Consequently, ψ is a homomorphism of G/K onto G',
and Theorem 2.7.1 is proved. □
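
The four steps of the proof (well defined, onto, 1-1, homomorphism) can be checked mechanically in a small case. The following Python sketch is an illustration only, under the assumption that G = Z₁₂ and G' = Z₄ are written additively and φ is reduction mod 4; the names phi, coset, and psi are hypothetical.

```python
n, k = 12, 4                       # G = Z_12, G' = Z_4 (both additive)

def phi(a):
    """Reduction mod 4, a homomorphism of Z_12 onto Z_4 because 4 divides 12."""
    return a % k

G = range(n)
K = frozenset(a for a in G if phi(a) == 0)           # kernel of phi

def coset(a):
    """The coset K + a."""
    return frozenset((x + a) % n for x in K)

# psi(K + a) = phi(a); storing it in a dict exposes any clash between representatives.
psi = {}
well_defined = True
for a in G:
    c = coset(a)
    if c in psi and psi[c] != phi(a):
        well_defined = False
    psi[c] = phi(a)

onto = set(psi.values()) == set(range(k))
one_to_one = len(set(psi.values())) == len(psi)
homomorphism = all(
    psi[coset((a + b) % n)] == (psi[coset(a)] + psi[coset(b)]) % k
    for a in G for b in G
)
print(sorted(K), well_defined, onto, one_to_one, homomorphism)
```

Here the keys of psi are the cosets of K, so psi is literally a map defined on G/K rather than on G.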
Having talked about the First Homomorphism Theorem suggests that
there are others. The next result, however, is an extension of the First
Homomorphism Theorem, and is traditionally called the Correspondence
Theorem. In the context of the theorem above, it exhibits a 1-1
correspondence between subgroups of G' and those subgroups of G that contain K.


Theorem 2.7.2 (Correspondence Theorem). Let the map φ: G → G' be
a homomorphism of G onto G' with kernel K. If H' is a subgroup of G' and if


H = {a ∈ G | φ(a) ∈ H'},


then H is a subgroup of G, H ⊃ K, and H/K ≃ H'. Finally, if H' ◁ G', then
H ◁ G.


Proof. We first verify that the H above is a subgroup of G. It is not
empty, since e ∈ H. If a, b ∈ H, then φ(a), φ(b) ∈ H', hence φ(ab) =
φ(a)φ(b) ∈ H', since H' is a subgroup of G'; this puts ab in H, so H is
closed. Further, if a ∈ H, then φ(a) ∈ H', hence φ(a⁻¹) = φ(a)⁻¹ is in H',
again since H' is a subgroup of G', whence a⁻¹ ∈ H. Therefore, H is a
subgroup of G.

Because φ(K) = {e'} ⊂ H', where e' is the unit element of G', we have
that K ⊂ H. Since K ◁ G and K ⊂ H, it follows that K ◁ H. The mapping φ
restricted to H defines a homomorphism of H onto H' with kernel K. By the
First Homomorphism Theorem we get H/K ≃ H'.

Finally, if H' ◁ G' and if a ∈ G, then φ(a)⁻¹H'φ(a) ⊂ H', so
φ(a⁻¹)H'φ(a) ⊂ H'. This tells us that φ(a⁻¹Ha) ⊂ H', so a⁻¹Ha ⊂ H. This
proves the normality of H in G. □


It is worth noting that if K is any normal subgroup of G, and φ is the
natural homomorphism of G onto G/K, then the theorem gives us a 1-1
correspondence between all subgroups H' of G/K and those subgroups of G
that contain K. Moreover, this correspondence preserves normality in the
sense that H' is normal in G/K if and only if H is normal in G. (See Problem
7, as well as the last conclusion of the theorem.)
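
The correspondence can be listed explicitly in a small case. The sketch below is illustrative and rests on assumptions not in the text: G = Z₁₂ and G' = Z₄ are written additively, φ is reduction mod 4, and the subgroups are enumerated using the fact that every subgroup of a cyclic group is cyclic. The names generated and preimage are hypothetical.

```python
n, k = 12, 4                       # G = Z_12, G' = Z_4 (both additive)

def phi(a):
    """Reduction mod 4, a homomorphism of Z_12 onto Z_4."""
    return a % k

def generated(g, modulus):
    """The cyclic subgroup generated by g in Z_modulus."""
    return frozenset((g * t) % modulus for t in range(modulus))

# Every subgroup of a cyclic group is cyclic, so these sets list all subgroups.
subgroups_G = {generated(g, n) for g in range(n)}
subgroups_Gp = {generated(g, k) for g in range(k)}

K = frozenset(a for a in range(n) if phi(a) == 0)    # kernel of phi

def preimage(Hp):
    """H = {a in G | phi(a) in H'}."""
    return frozenset(a for a in range(n) if phi(a) in Hp)

containing_K = {H for H in subgroups_G if K <= H}
pulled_back = {preimage(Hp) for Hp in subgroups_Gp}

for Hp in sorted(subgroups_Gp, key=len):
    print(sorted(Hp), "<->", sorted(preimage(Hp)))
```

The preimages of the subgroups of G' turn out to be exactly the subgroups of G that contain K, and each preimage H has |H| = |H'||K|, consistent with H/K ≃ H'.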

We now state the Second Homomorphism Theorem, leaving its proof 
to the reader in Problem 5. 


Theorem 2.7.3 (Second Homomorphism Theorem). Let H be a
subgroup of a group G and N a normal subgroup of G. Then HN =
{hn | h ∈ H, n ∈ N} is a subgroup of G, H ∩ N is a normal subgroup of H,
and H/(H ∩ N) ≃ (HN)/N.
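
A small computation can make the statement concrete. The Python sketch below is an illustration under assumed data, not a proof: G = S₄, N is the subgroup consisting of the identity and the three products of two disjoint transpositions, and H is the copy of S₃ fixing the last point. It checks that HN is closed under the product and that h(H ∩ N) → hN is a well-defined bijection of cosets; it does not check the homomorphism property. All helper names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Product pq of two permutations stored as tuples: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

e = (0, 1, 2, 3)
G = list(permutations(range(4)))                           # S_4

# N: the identity together with the three products of two disjoint transpositions.
N = frozenset(p for p in G
              if p == e or (compose(p, p) == e and all(p[i] != i for i in range(4))))
# H: the permutations fixing the point 3, a copy of S_3 inside S_4.
H = frozenset(p for p in G if p[3] == 3)

HN = frozenset(compose(h, n) for h in H for n in N)
H_cap_N = H & N

def left_coset(a, S):
    """The left coset aS (equal to the right coset Sa when S is normal)."""
    return frozenset(compose(a, s) for s in S)

# The map h(H ∩ N) -> hN from H/(H ∩ N) to (HN)/N, recorded as a set of pairs.
pairs = {(left_coset(h, H_cap_N), left_coset(h, N)) for h in H}
domains = {d for d, _ in pairs}
images = {c for _, c in pairs}

closed = all(compose(x, y) in HN for x in HN for y in HN)
print(closed, len(domains), len(images), len(HN) // len(N))
```

Equal counts of pairs, domain cosets, and image cosets mean that each coset of H ∩ N has exactly one image and that distinct cosets have distinct images.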
Finally, we go on to the Third Homomorphism Theorem, which tells us
a little more about the relationship between N and N' when N' ◁ G'.


Theorem 2.7.4 (Third Homomorphism Theorem). If the map
φ: G → G' is a homomorphism of G onto G' with kernel K then, if N' ◁ G'
and N = {a ∈ G | φ(a) ∈ N'}, we conclude that G/N ≃ G'/N'. Equivalently,
G/N ≃ (G/K)/(N/K).


Proof. Define the mapping ψ: G → G'/N' by ψ(a) = N'φ(a) for every
a ∈ G. Since φ is onto G' and every element of G'/N' is a coset of the form
N'x', and x' = φ(x) for some x ∈ G, we see that ψ maps G onto G'/N'.

Furthermore, ψ is a homomorphism of G onto G'/N', for ψ(ab) =
N'φ(ab) = N'φ(a)φ(b) = (N'φ(a))(N'φ(b)) = ψ(a)ψ(b), since N' ◁ G'.
What is the kernel, M, of ψ? If a ∈ M, then ψ(a) is the unit element of
G'/N', that is, ψ(a) = N'. On the other hand, by the definition of ψ, ψ(a) =
N'φ(a). Because N'φ(a) = N' we must have φ(a) ∈ N'; but this puts a in N,
by the very definition of N. Thus M ⊂ N. That N ⊂ M is easy and is left to
the reader. Therefore, M = N, so ψ is a homomorphism of G onto G'/N'
with kernel N, whence, by the First Homomorphism Theorem, G/N ≃ G'/N'.

Finally, again by Theorems 2.7.1 and 2.7.2, G' ≃ G/K, N' ≃ N/K, which
leads us to G/N ≃ G'/N' ≃ (G/K)/(N/K). □


This last equality is highly suggestive; we are sort of “canceling out” the 
K in the numerator and denominator. 


PROBLEMS 


1. Show that M ⊃ N in the proof of Theorem 2.7.4.

2. Let G be the group of all real-valued functions on the unit interval [0, 1],
where we define, for f, g ∈ G, addition by (f + g)(x) = f(x) + g(x) for
every x ∈ [0, 1]. If N = {f ∈ G | f(1/4) = 0}, prove that G/N ≃ real
numbers under +.

3. Let G be the group of nonzero real numbers under multiplication and let
N = {1, −1}. Prove that G/N ≃ positive real numbers under multiplication.

4. If G₁, G₂ are two groups and G = G₁ × G₂ = {(a, b) | a ∈ G₁, b ∈ G₂},
where we define (a, b)(c, d) = (ac, bd), show that:

(a) N = {(a, e₂) | a ∈ G₁}, where e₂ is the unit element of G₂, is a normal
subgroup of G.

(b) N ≃ G₁.

(c) G/N ≃ G₂.

5. Let G be a group, H a subgroup of G, and N ◁ G. Let the set HN =
{hn | h ∈ H, n ∈ N}. Prove that:

(a) H ∩ N ◁ H.

(b) HN is a subgroup of G.

(c) N ⊂ HN and N ◁ HN.

(d) (HN)/N ≃ H/(H ∩ N).

*6. If G is a group and N ◁ G, show that if a ∈ G has finite order o(a), then
Na in G/N has finite order m, where m | o(a). (Prove this by using the
homomorphism of G onto G/N.)

7. If φ is a homomorphism of G onto G' and N ◁ G, show that φ(N) ◁ G'.


8. CAUCHY’S THEOREM 


In Theorem 2.6.4—Cauchy’s Theorem—we proved that if a prime p divides 
the order of a finite abelian group G, then G contains an element of order p. 
We did point out there that Cauchy’s Theorem is true even if the group is not 
abelian. We shall give a very neat proof of this here; this proof is due to 
McKay. 

We return for a moment to set theory, doing something that we men- 
tioned in the problems in Section 4. 

Let S be a set, f ∈ A(S), and define a relation on S as follows: s ~ t if
t = fⁱ(s) for some integer i (i can be positive, negative, or zero). We leave it
to the reader as a problem that this does indeed define an equivalence
relation on S. The equivalence class of s, [s], is called the orbit of s under f. So S
is the disjoint union of the orbits of its elements.

When f is of order p, p a prime, we can say something about the size of
the orbits under f; those of the readers who solved Problem 34 of Section 4
already know the result. We prove it here to put it on the record officially.

[If fᵏ(s) = s, of course f^{tk}(s) = s for every integer t. (Prove!)]


Lemma 2.8.1. If f ∈ A(S) is of order p, p a prime, then the orbit of
any element of S under f has 1 or p elements.


Proof. Let s ∈ S; if f(s) = s, then the orbit of s under f consists merely 
of s itself, so has one element. Suppose then that f(s) ≠ s. Consider the 
elements s, f(s), f^2(s), ..., f^{p-1}(s); we claim that these p elements are 
distinct and constitute the orbit of s under f. If not, then f^i(s) = f^j(s) for 
some 0 ≤ i < j ≤ p - 1, which gives us that f^{j-i}(s) = s. Let m = j - i; then 
0 < m ≤ p - 1 and f^m(s) = s. But f^p(s) = s and since p ∤ m, ap + bm = 1 for 
some integers a and b. Thus f^1(s) = f^{ap+bm}(s) = f^{ap}(f^{bm}(s)) = f^{ap}(s) = s, 
since f^m(s) = f^p(s) = s. This contradicts that f(s) ≠ s. Thus the orbit of s 
under f consists of s, f(s), f^2(s), ..., f^{p-1}(s), so has p elements. □ 
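
As a quick sanity check of Lemma 2.8.1, the following Python sketch computes the orbits of one permutation of order 3 on a 7-element set. The list encoding of f (perm[s] stands for f(s)) and the helper name orbits are our own choices for illustration, not notation from the text.

```python
def orbits(perm):
    """Return the orbits of a permutation f of {0, ..., n-1}, where perm[s] = f(s)."""
    seen, result = set(), []
    for s in range(len(perm)):
        if s in seen:
            continue
        orbit, t = [], s
        # Follow s, f(s), f^2(s), ... until the cycle closes up.
        while t not in seen:
            seen.add(t)
            orbit.append(t)
            t = perm[t]
        result.append(orbit)
    return result


# f = (0 1 2)(3 4 5) with 6 fixed, so f has order 3 (a prime).
f = [1, 2, 0, 4, 5, 3, 6]
sizes = sorted(len(o) for o in orbits(f))
```

For this f every orbit has 1 or 3 elements, in agreement with the lemma; a permutation whose order is not prime (for example a 4-cycle times a 2-cycle) would not be bound by it.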


We now give McKay’s proof of Cauchy’s Theorem. 


Theorem 2.8.2 (Cauchy). If p is a prime and p divides the order of G, 
then G contains an element of order p. 


Proof. If p = 2, the result amounts to Problem 18 in Section 1. Assume 
that p ≠ 2. Let S be the set of all ordered p-tuples (a_1, a_2, ..., a_{p-1}, a_p), 
where a_1, a_2, ..., a_p are in G and where a_1 a_2 ··· a_{p-1} a_p = e. We claim 
that S has n^{p-1} elements where n = |G|. Why? We can choose a_1, ..., a_{p-1} 
arbitrarily in G, and by putting a_p = (a_1 a_2 ··· a_{p-1})^{-1}, the p-tuple 
(a_1, a_2, ..., a_{p-1}, a_p) then satisfies 


    a_1 a_2 ··· a_{p-1} a_p = a_1 a_2 ··· a_{p-1} (a_1 a_2 ··· a_{p-1})^{-1} = e, 


so is in S. Thus S has n^{p-1} elements. 

Note that if a_1 a_2 ··· a_{p-1} a_p = e, then a_p a_1 a_2 ··· a_{p-1} = e (for if 
xy = e in a group, then yx = e). So the mapping f: S → S defined by 
f(a_1, ..., a_p) = (a_p, a_1, a_2, ..., a_{p-1}) is in A(S). Note that f ≠ i, the 
identity map on S, and that f^p = i, so f is of order p. 

If the orbit of s under f has one element, then f(s) = s. On the other 
hand, if f(s) ≠ s, we know that the orbit of s under f consists precisely of p 
distinct elements; this we have by Lemma 2.8.1. Now when is f(s) ≠ s? We 
claim that f(s) ≠ s if and only if when s = (a_1, a_2, ..., a_p), then for some 
i ≠ j, a_i ≠ a_j. (We leave this to the reader.) So f(s) = s if and only if s = 
(a, a, ..., a) for some a ∈ G. 

Let m be the number of s ∈ S such that f(s) = s; since for s = 
(e, e, ..., e), f(s) = s, we know that m ≥ 1. On the other hand, if f(s) ≠ s, 
the orbit of s consists of p elements, and these orbits are disjoint, for they are 
equivalence classes. If there are k such orbits where f(s) ≠ s, we get that 
n^{p-1} = m + kp, for we have accounted this way for every element of S. 

But p | n by assumption and p | (kp). So we must have p | m, since m = 
n^{p-1} - kp. Because m ≠ 0 and p | m, we get that m > 1. But this says that 
there is an s = (a, a, ..., a) ≠ (e, e, ..., e) in S; from the definition of S this 
implies that a^p = e. Since a ≠ e, a is the required element of order p. □ 


Note that the proof tells us that the number of solutions in G of x^p = e 
is a positive multiple of p. 
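
The counting in McKay's proof can be carried out by machine for a small case. The sketch below takes G to be the symmetric group on three letters and p = 3, builds the set S of the proof, and counts the tuples fixed by f (these are exactly the constant tuples). Representing permutations as tuples, and the names compose and mckay_count, are assumptions made for this illustration only.

```python
from itertools import permutations, product


def compose(a, b):
    """Product of two permutations given as tuples (apply b first, then a)."""
    return tuple(a[b[i]] for i in range(len(b)))


def mckay_count(G, e, p):
    """Return (|S|, m): S is the set of p-tuples over G with product e,
    and m is the number of constant tuples (a, a, ..., a) in S."""
    S = []
    for head in product(G, repeat=p - 1):
        prod = e
        for g in head:
            prod = compose(prod, g)
        # The last entry is forced: it must be the inverse of the product so far.
        last = next(g for g in G if compose(prod, g) == e)
        S.append(head + (last,))
    m = sum(1 for s in S if len(set(s)) == 1)
    return len(S), m


G = list(permutations(range(3)))   # the 6 elements of the symmetric group
e = tuple(range(3))                # identity permutation
size, m = mckay_count(G, e, 3)
k = (size - m) // 3                # number of orbits with 3 elements
```

Here size is 6^2 = 36 and m is 3: the identity together with the two elements of order 3, so the relation n^{p-1} = m + kp reads 36 = 3 + 11·3.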

We strongly urge the reader who feels uncomfortable with the proof 
just given to carry out its details for p = 3. In this case the action of f on S 
becomes clear and our assertions about this action can be checked explicitly. 

Cauchy’s Theorem has many consequences. We shall present one of 
these, in which we determine completely the nature of certain groups of 
order pq, where p and q are distinct primes. Other consequences will be 
found in the problem set to follow, and in later material on groups. 


Lemma 2.8.3. Let G be a group of order pq, where p, q are primes 
and p > q. If a ∈ G is of order p and A is the subgroup of G generated by a, 
then A ◁ G. 


Proof. We claim that A is the only subgroup of G of order p. Suppose 
that B is another subgroup of order p. Consider the set AB = 
{xy | x ∈ A, y ∈ B}; we claim that AB has p^2 distinct elements. For suppose 
that xy = uv where x, u ∈ A, y, v ∈ B; then u^{-1}x = vy^{-1}. But u^{-1}x ∈ A, 
vy^{-1} ∈ B, and since u^{-1}x = vy^{-1}, we have u^{-1}x ∈ A ∩ B. Since B ≠ A and 
A ∩ B is a subgroup of A and A is of prime order, we are forced to conclude 
that A ∩ B = (e) and so u^{-1}x = e, that is, u = x. Similarly, v = y. Thus the 
number of distinct elements in AB is p^2. But all these elements are in G, 
which has only pq < p^2 elements (since p > q). With this contradiction we 
see that B = A and A is the only subgroup of order p in G. But if x ∈ G, B = 
x^{-1}Ax is a subgroup of G of order p, in consequence of which we conclude 
that x^{-1}Ax = A; hence A ◁ G. □ 


Corollary. If G, a are as in Lemma 2.8.3 and if x ∈ G, then 
x^{-1}ax = a^i, where 0 < i < p, for some i (depending on x). 


Proof. Since e ≠ a ∈ A and x^{-1}Ax = A, x^{-1}ax ∈ A. But every element 
of A is of the form a^i, 0 ≤ i < p, and x^{-1}ax ≠ e. In consequence, x^{-1}ax = a^i, 
where 0 < i < p. □ 


We now prove a result of a different flavor. 


Lemma 2.8.4. If a ∈ G is of order m and b ∈ G is of order n, where 
m and n are relatively prime and ab = ba, then c = ab is of order mn. 


Proof. Suppose that A is the subgroup generated by a and B that generated 
by b. Because |A| = m and |B| = n and (m, n) = 1, we get A ∩ B = (e), 
which follows from Lagrange’s Theorem, for |A ∩ B| | n and |A ∩ B| | m. 

Suppose that c^i = e, where i > 0; thus (ab)^i = e. Since ab = ba, e = 
(ab)^i = a^i b^i; this tells us that a^i = b^{-i} ∈ A ∩ B = (e). So a^i = e, whence 
m | i, and b^i = e, whence n | i. Because (m, n) = 1 and m and n both divide i, 
mn divides i. So i ≥ mn. Since (ab)^{mn} = a^{mn} b^{mn} = e, we see that mn is the 
smallest positive integer i such that (ab)^i = e. This says that ab is of order mn, as 
claimed in the lemma. □ 

Before considering the more general case of groups of order pq, let’s 
look at a special case, namely, a group G of order 15. By Cauchy’s Theorem, 
G has elements b of order 3 and a of order 5. By the Corollary to Lemma 
2.8.3, b^{-1}ab = a^i, where 0 < i < 5. Thus 


    b^{-2}ab^2 = b^{-1}(b^{-1}ab)b = b^{-1}a^i b = (b^{-1}ab)^i = (a^i)^i = a^{i^2} 


and similarly, b^{-3}ab^3 = a^{i^3}. But b^3 = e, so we get a^{i^3} = a, whence 
a^{i^3 - 1} = e. Since a is of order 5, 5 must divide i^3 - 1, that is, i^3 ≡ 1 (5). 
However, by Fermat’s Theorem (Corollary to Theorem 2.4.8), i^4 ≡ 1 (5). These 
two equations for i tell us that i ≡ 1 (5), so, since 0 < i < 5, i = 1. In short, 
b^{-1}ab = a^i = a, which means that ab = ba. Since a is of order 5 and b of 
order 3, by Lemma 2.8.4, c = ab is of order 15. This means that the 15 powers 
e = c^0, c, c^2, ..., c^14 are distinct, so must sweep out all of G. In a word, G 
must be cyclic. 

The argument given for 15 could have been made shorter, but the form 
in which we did it is the exact prototype for the proof of the more general 


Theorem 2.8.5. Let G be a group of order pq, where p, q are primes 
and p > q. If q ∤ p - 1, then G must be cyclic. 


Proof. By Cauchy’s Theorem, G has an element a of order p and an 
element b of order q. By the Corollary to Lemma 2.8.3, b^{-1}ab = a^i for some i 
with 0 < i < p. Thus b^{-r}ab^r = a^{i^r} for all r ≥ 0 (Prove!), and so 
b^{-q}ab^q = a^{i^q}. But b^q = e; therefore, a^{i^q} = a and so a^{i^q - 1} = e. 
Because a is of order p, we conclude that p | i^q - 1, which is to say, 
i^q ≡ 1 (p). However, by Fermat’s Theorem, i^{p-1} ≡ 1 (p). Since q ∤ p - 1, we 
conclude that i ≡ 1 (p), and since 0 < i < p, i = 1 follows. Therefore, 
b^{-1}ab = a^i = a, hence ab = ba. By Lemma 2.8.4, c = ab has order pq, so the 
powers of c sweep out all of G. Thus G is cyclic, and the theorem is proved. □ 
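
The number-theoretic step of the proof, that i^q ≡ 1 (p) forces i ≡ 1 (p) when q ∤ p - 1, is easy to test numerically. The sketch below lists all i with 0 < i < p and i^q ≡ 1 (p) for a few pairs of primes; the function name q_th_roots_of_unity is our own label, introduced only for this illustration.

```python
def q_th_roots_of_unity(p, q):
    """All i with 0 < i < p and i**q congruent to 1 modulo p."""
    return [i for i in range(1, p) if pow(i, q, p) == 1]


# q does not divide p - 1: only i = 1 survives, as in the proof.
case_15 = q_th_roots_of_unity(5, 3)    # groups of order 15
case_35 = q_th_roots_of_unity(7, 5)    # groups of order 35

# q divides p - 1: other exponents i are possible.
case_21 = q_th_roots_of_unity(7, 3)    # groups of order 21
```

For orders 15 and 35 only i = 1 occurs, so ab = ba is forced. For order 21 the values i = 2 and i = 4 also occur, which is the room that Problems 4 and 14 below exploit to build nonabelian groups.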


PROBLEMS 
Middle-Level Problems 


1. In the proof of Theorem 2.8.2, show that if some two entries in s = 
(a_1, a_2, ..., a_p) are different, then f(s) ≠ s, and the orbit of s under f 
has p elements. 

2. Prove that a group of order 35 is cyclic. 


3. Using the result of Problem 40 of Section 5, give another proof of 
Lemma 2.8.3. (Hint: Use for H a subgroup of order p.) 


4. Construct a nonabelian group of order 21. (Hint: Assume that a^3 = e, 
b^7 = e and find some i such that a^{-1}ba = b^i ≠ b, which is consistent with 
the relations a^3 = b^7 = e.) 

5. Let G be a group of order p^n m, where p is prime and p ∤ m. Suppose that 
G has a normal subgroup P of order p^n. Prove that θ(P) = P for every 
automorphism θ of G. 

6. Let G be a finite group with subgroups A, B such that |A| > √|G| and 
|B| > √|G|. Prove that A ∩ B ≠ (e). 

7. If G is a group with subgroups A, B of orders m, n, respectively, where 
m and n are relatively prime, prove that the subset of G, 
AB = {ab | a ∈ A, b ∈ B}, has mn distinct elements. 

8. Prove that a group of order 99 has a nontrivial normal subgroup. 

9. Prove that a group of order 42 has a nontrivial normal subgroup. 

10. From the result of Problem 9, prove that a group of order 42 has a nor- 
mal subgroup of order 21. 


Harder Problems 


11. If G is a group and A, B finite subgroups of G, prove that the set AB = 
{ab | a ∈ A, b ∈ B} has (|A| |B|)/|A ∩ B| distinct elements. 

12. Prove that any two nonabelian groups of order 21 are isomorphic. (See 
Problem 4.) 


Very Hard Problems 


13. Using the fact that any group of order 9 is abelian, prove that any group 
of order 99 is abelian. 

14. Let p > q be two primes such that q | p - 1. Prove that there exists a 
nonabelian group of order pq. (Hint: Use the result of Problem 40 of 
Section 4, namely that U_p is cyclic if p is a prime, and the idea needed to 
do Problem 4 above.) 

15. Prove that if p > q are two primes such that q | p - 1, then any two 
nonabelian groups of order pq are isomorphic. 


9. DIRECT PRODUCTS 


In several of the problems and examples that appeared earlier, we went 
through the following construction: If G_1, G_2 are two groups, then G = 
G_1 × G_2 is the set of all ordered pairs (a, b), where a ∈ G_1 and b ∈ G_2 and 
where the product was defined component-wise via (a_1, b_1)(a_2, b_2) = 
(a_1 a_2, b_1 b_2), the products in each component being carried out in the 
respective groups G_1 and G_2. We should like to formalize this procedure here. 


Definition. If G_1, G_2, ..., G_n are n groups, then their (external) 
direct product G_1 × G_2 × G_3 × ··· × G_n is the set of all ordered n-tuples 
(a_1, a_2, ..., a_n) where a_i ∈ G_i, for i = 1, 2, ..., n, and where the product 
in G_1 × G_2 × ··· × G_n is defined component-wise, that is, 

    (a_1, a_2, ..., a_n)(b_1, b_2, ..., b_n) = (a_1 b_1, a_2 b_2, ..., a_n b_n). 


That G = G_1 × G_2 × ··· × G_n is a group is immediate, with 
(e_1, e_2, ..., e_n) as its unit element, where e_i is the unit element of G_i, and 
where (a_1, a_2, ..., a_n)^{-1} = (a_1^{-1}, a_2^{-1}, ..., a_n^{-1}). 

G is merely the Cartesian product of the groups G_1, G_2, ..., G_n with a 
product defined in G by component-wise multiplication. We call it external, 
since the groups G_1, G_2, ..., G_n are any groups, with no relation necessarily 
holding among them. 
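
The definition translates almost word for word into code. In the sketch below a finite group is modeled, purely as an assumption of this illustration, by a 4-tuple (elements, product, unit element, inverse); direct_product then builds the component-wise structure, and cyclic is a hypothetical helper supplying the integers mod n under addition as sample factors.

```python
from itertools import product


def direct_product(*groups):
    """External direct product of groups given as (elements, op, identity, inverse)."""
    elements = list(product(*(g[0] for g in groups)))

    def op(a, b):
        # Multiply component by component, each in its own group.
        return tuple(g[1](x, y) for g, x, y in zip(groups, a, b))

    identity = tuple(g[2] for g in groups)

    def inverse(a):
        return tuple(g[3](x) for g, x in zip(groups, a))

    return elements, op, identity, inverse


def cyclic(n):
    """The integers mod n under addition (sample factor for the demo)."""
    return list(range(n)), (lambda x, y: (x + y) % n), 0, (lambda x: (-x) % n)


elements, op, e, inv = direct_product(cyclic(2), cyclic(3))
```

With these two factors the product has 2·3 = 6 elements, its unit element is (0, 0), and the inverse of a tuple is the tuple of inverses, exactly as stated above.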

Consider the subsets Ḡ_i ⊂ G_1 × G_2 × ··· × G_n = G, where 


    Ḡ_i = {(e_1, ..., e_{i-1}, a_i, e_{i+1}, ..., e_n) | a_i ∈ G_i}; 


in other words, Ḡ_i consists of all n-tuples where in the ith component any 
element of G_i can occur and where every other component is the identity 
element. Clearly, Ḡ_i is a group and is isomorphic to G_i, by the isomorphism 
π_i: Ḡ_i → G_i defined by π_i(e_1, e_2, ..., a_i, ..., e_n) = a_i. Furthermore, not 
only is Ḡ_i a subgroup of G but Ḡ_i ◁ G. (Prove!) 

Given any element a = (a_1, a_2, ..., a_n) ∈ G, then 

    a = (a_1, e_2, ..., e_n)(e_1, a_2, e_3, ..., e_n) ··· (e_1, e_2, ..., e_{n-1}, a_n); 

that is, every a ∈ G can be written as a = ā_1 ā_2 ··· ā_n, where each ā_i ∈ Ḡ_i. 
Moreover, a can be written in this way in a unique manner, that is, if a = 
ā_1 ā_2 ··· ā_n = b̄_1 b̄_2 ··· b̄_n, where the ā_i ∈ Ḡ_i and b̄_i ∈ Ḡ_i, then 
ā_1 = b̄_1, ..., ā_n = b̄_n. So G is built up from certain normal subgroups, the 
Ḡ_i, as G = Ḡ_1 Ḡ_2 ··· Ḡ_n, in such a way that every element a ∈ G has a 
unique representation in the form a = ā_1 ā_2 ··· ā_n, with ā_i ∈ Ḡ_i. 


This motivates the following 


Definition. The group G is said to be the (internal) direct product of 
its normal subgroups N_1, N_2, ..., N_n if every a ∈ G has a unique 
representation in the form a = a_1 a_2 ··· a_n, where each a_i ∈ N_i for 
i = 1, 2, ..., n. 


From what we have discussed above we have the 


Lemma 2.9.1. If G = G_1 × G_2 × ··· × G_n is the external direct product 
of G_1, G_2, ..., G_n, then G is the internal direct product of the normal 
subgroups Ḡ_1, Ḡ_2, ..., Ḡ_n defined above. 


We want to go in the other direction, namely to prove that if G is the 
internal direct product of its normal subgroups N_1, N_2, ..., N_n, then G is 
isomorphic to N_1 × N_2 × ··· × N_n. To do so, we first get some preliminary 
results. 

The result we are about to prove has already occurred as Problem 20, 
Section 5. For the sake of completeness we prove it here. 


Lemma 2.9.2. Let G be a group, M, N normal subgroups of G such 
that M ∩ N = (e). Then, given m ∈ M and n ∈ N, mn = nm. 


Proof. Consider the element a = mnm^{-1}n^{-1}. Viewing a as bracketed 
one way, a = (mnm^{-1})n^{-1}; then, since N ◁ G and n ∈ N, mnm^{-1} ∈ N, so 
a = (mnm^{-1})n^{-1} is also in N. Now bracket a in the other way, 
a = m(nm^{-1}n^{-1}). Since M ◁ G and m^{-1} ∈ M, we have nm^{-1}n^{-1} ∈ M and 
so a = m(nm^{-1}n^{-1}) ∈ M. Thus a ∈ M ∩ N = (e), which is to say, 
mnm^{-1}n^{-1} = e. This gives us that mn = nm, as required. □ 


If G is the internal direct product of the normal subgroups N_1, 
N_2, ..., N_n, we claim that N_i ∩ N_j = (e) for i ≠ j. For suppose that 
a ∈ N_i ∩ N_j; then a = e·e ··· e·a·e ··· e, where the a occurs in the ith place. 
This gives us one representation of a in G = N_1 N_2 ··· N_n. On the other hand, 
a = e·e ··· e·a·e ··· e, where the a occurs in the jth place, so a has the second 
representation as an element of N_1 N_2 ··· N_n. By the uniqueness of the 
representation, we get a = e, and so N_i ∩ N_j = (e). 

Perhaps things would be clearer if we do it for n = 2. So suppose that 
N_1 ◁ G, N_2 ◁ G, and every element a ∈ G has a unique representation as 
a = a_1·a_2, where a_1 ∈ N_1, a_2 ∈ N_2. Suppose that a ∈ N_1 ∩ N_2; then a = 
a·e is a representation of a = a_1·a_2 with a_1 = a ∈ N_1, a_2 = e ∈ N_2. 
However, a = e·a, so a = b_1·b_2, where b_1 = e ∈ N_1, b_2 = a ∈ N_2. By the 
uniqueness of the representation we must have a_1 = b_1, that is, a = e. 
So N_1 ∩ N_2 = (e). 

The argument given above for N_1, ..., N_n is the same argument as that 
given for n = 2, but perhaps is less transparent. At any rate we have proved 


Lemma 2.9.3. If G is the internal direct product of its normal 
subgroups N_1, N_2, ..., N_n, then, for i ≠ j, N_i ∩ N_j = (e). 


Corollary. If G is as in Lemma 2.9.3, then if i ≠ j and a_i ∈ N_i and 
a_j ∈ N_j, we have a_i a_j = a_j a_i. 


Proof. By Lemma 2.9.3, N_i ∩ N_j = (e) for i ≠ j. Since the N’s are normal 
in G, by Lemma 2.9.2 we have that any element in N_i commutes with any 
element in N_j, that is, a_i a_j = a_j a_i for a_i ∈ N_i, a_j ∈ N_j. □ 


With these preliminaries out of the way we can now prove 


Theorem 2.9.4. Let G be a group with normal subgroups N_1, N_2, ..., 
N_n. Then the mapping ψ(a_1, a_2, ..., a_n) = a_1 a_2 ··· a_n is an isomorphism 
from N_1 × N_2 × ··· × N_n (external direct product) onto G if and only if G is 
the internal direct product of N_1, N_2, ..., N_n. 


Proof. Suppose G is an internal direct product of N_1, ..., N_n. Since 
every element a in G has a representation a = a_1 a_2 ··· a_n, with the a_i ∈ N_i, 
we have that the mapping ψ is onto. We assert that it is also 1-1. For if 
ψ((a_1, a_2, ..., a_n)) = ψ((b_1, b_2, ..., b_n)), then by the definition of ψ, 
a_1 a_2 ··· a_n = b_1 b_2 ··· b_n. By the uniqueness of the representation of an 
element in this form we deduce that a_1 = b_1, a_2 = b_2, ..., a_n = b_n. Hence 
ψ is 1-1. 


All that remains is to show that ψ is a homomorphism. So, consider 


    ψ((a_1, a_2, ..., a_n)(b_1, b_2, ..., b_n)) = ψ((a_1 b_1, a_2 b_2, ..., a_n b_n)) 
                                               = (a_1 b_1)(a_2 b_2) ··· (a_n b_n) 
                                               = a_1 b_1 a_2 b_2 ··· a_n b_n. 


Since b_1 ∈ N_1, it commutes with a_i, b_i for i > 1 by the Corollary to Lemma 
2.9.3. So we can pull the b_1 across all the elements to the right of it to get 
a_1 b_1 a_2 b_2 ··· a_n b_n = a_1 a_2 b_2 a_3 b_3 ··· a_n b_n b_1. Now repeat this 
procedure with b_2, and so on, to get that 
a_1 b_1 a_2 b_2 ··· a_n b_n = (a_1 a_2 ··· a_n)(b_1 b_2 ··· b_n). Thus 


    ψ((a_1, a_2, ..., a_n)(b_1, b_2, ..., b_n)) = a_1 b_1 a_2 b_2 ··· a_n b_n 
                                               = (a_1 a_2 ··· a_n)(b_1 b_2 ··· b_n) 
                                               = ψ((a_1, a_2, ..., a_n)) ψ((b_1, b_2, ..., b_n)). 


In other words, ψ is a homomorphism. 

On the other hand, suppose that ψ is an isomorphism. Then the conclusion 
that G is the internal direct product of N_1, N_2, ..., N_n easily follows 
from the fact that ψ is onto and 1-1. 

With this the proof of Theorem 2.9.4 is complete. □ 


Corollary. Let G be a group with normal subgroups N_1, N_2. Then G 
is the internal direct product of N_1 and N_2 if and only if G = N_1 N_2 and 
N_1 ∩ N_2 = (e). 


Proof. This follows easily from the fact that ψ: N_1 × N_2 → G, which is 
given by ψ(a_1, a_2) = a_1 a_2, is an isomorphism if and only if N_1 N_2 = G and 
N_1 ∩ N_2 = (e). □ 
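
The criterion of the Corollary is a finite check whenever G is finite. As an illustration under our own choice of example, the sketch below takes G to be the integers mod 15 that are prime to 15, under multiplication, and tests two pairs of subgroups. Since this G is abelian, normality is automatic; the function is_internal_direct_product assumes normality and does not verify it.

```python
def is_internal_direct_product(G, N1, N2, op, e):
    """Test the two conditions of the Corollary: G = N1 N2 and N1 ∩ N2 = (e).

    Normality of N1 and N2 is assumed, not checked (it is automatic in the
    abelian demo group used below).
    """
    products = {op(a, b) for a in N1 for b in N2}
    return products == set(G) and set(N1) & set(N2) == {e}


def mul15(a, b):
    """Multiplication modulo 15."""
    return (a * b) % 15


U15 = [1, 2, 4, 7, 8, 11, 13, 14]   # integers mod 15 that are prime to 15

good = is_internal_direct_product(U15, [1, 11], [1, 2, 4, 8], mul15, 1)
bad = is_internal_direct_product(U15, [1, 4], [1, 2, 4, 8], mul15, 1)
```

The first pair meets only in 1 and its products exhaust G, so G is their direct product. The second pair fails on both counts, because {1, 4} is contained in {1, 2, 4, 8}.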


In view of the result of Theorem 2.9.4 and its corollary, we drop the 
adjectives “internal” and “external” and merely speak about the “direct 
product.” When the notation G = N_1 × N_2 is used it should be clear from 
context whether it stands for the internal or external direct product. 

The objective is often to show that a given group is the direct product 
of certain normal subgroups. If one can do this, the structure of the group 
can be completely determined if we happen to know those of the normal 
subgroups. 


PROBLEMS 


1. If G_1 and G_2 are groups, prove that G_1 × G_2 ≅ G_2 × G_1. 

2. If G_1 and G_2 are cyclic groups of orders m and n, respectively, prove that 
G_1 × G_2 is cyclic if and only if m and n are relatively prime. 

3. Let G be a group, A = G × G. In A let T = {(g, g) | g ∈ G}. 

(a) Prove that T ≅ G. 
(b) Prove that T ◁ A if and only if G is abelian. 

4. Let G be an abelian group of order p_1^{m_1} p_2^{m_2} ··· p_k^{m_k}, where 
p_1, p_2, ..., p_k are distinct primes and m_1 > 0, m_2 > 0, ..., m_k > 0. By 
Problem 10 of Section 6, for each i, G has a subgroup P_i of order p_i^{m_i}. 
Show that G ≅ P_1 × P_2 × ··· × P_k. 

5. Let G be a finite group, N_1, N_2, ..., N_k normal subgroups of G such that 
G = N_1 N_2 ··· N_k and |G| = |N_1| |N_2| ··· |N_k|. Prove that G is the direct 
product of N_1, N_2, ..., N_k. 

6. Let G be a group, N_1, N_2, ..., N_k normal subgroups of G such that: 

1. G = N_1 N_2 ··· N_k. 
2. For each i, N_i ∩ (N_1 N_2 ··· N_{i-1} N_{i+1} ··· N_k) = (e). 
Prove that G is the direct product of N_1, N_2, ..., N_k. 


10. FINITE ABELIAN GROUPS (OPTIONAL) 


We have just finished discussing the idea of the direct product of groups. If 
we were to leave that topic at the point where we ended, it might seem like 
a nice little construction, but so what? To give some more substance to it, 
we should prove at least one theorem which says that a group satisfying a 
certain condition is the direct product of some particularly easy groups. For- 
tunately, such a class of groups exists, the finite abelian groups. What we 
shall prove is that any finite abelian group is the direct product of cyclic 
groups. This reduces most questions about finite abelian groups to questions 
about cyclic groups, a reduction that often allows us to get complete an- 
swers to these questions. 

The results on the structure of finite abelian groups are really special 
cases of some wider and deeper theorems. To consider these would be going 
too far afield, especially since the story for finite abelian groups is so impor- 
tant in its own right. The theorem we shall prove is called the Fundamental 
Theorem on Finite Abelian Groups, and rightfully so. 

Before getting down to the actual details of the proof, we should like to 
give a quick sketch of how we shall go about proving the theorem. 

Our first step will be to reduce the problem from any finite abelian 
group to one whose order is p^n, where p is a prime. This step will be fairly 
easy to carry out, and since the group will have order involving just one 
prime, the details of the proof will not be cluttered with elements whose 
orders are somewhat complicated. 

So we shall focus on groups of order p^n. Let G be an abelian group of 
order p^n. We want to show that there exist cyclic subgroups of G, namely 
A_1, A_2, ..., A_k, such that every element x ∈ G can be written as x = 
b_1 b_2 ··· b_k, where each b_i ∈ A_i, in a unique way. Otherwise put, since 
each A_i is cyclic and generated by a_i, say, we want to show that x = 
a_1^{m_1} a_2^{m_2} ··· a_k^{m_k}, where the elements a_i^{m_i} are unique. 

A difficulty appears right away, for there is not just one choice for these 
elements a_1, ..., a_k. For instance, if G is the abelian group of order 4 with 
elements e, a, b, ab, where a^2 = b^2 = e and ab = ba, then we can see that if 
A, B, C are the cyclic subgroups generated by a, b, and ab, respectively, then 
G = A × B = A × C = B × C. So there is a lack of uniqueness in the choice of 
the a_i. How to get around this? 
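
The lack of uniqueness in the order 4 example can be confirmed directly. In the sketch below the group is modeled, as an assumption of this illustration, by pairs of integers mod 2 under component-wise addition, with a = (1, 0) and b = (0, 1); the helper is_direct applies the criterion G = MN, M ∩ N = (e) from the Corollary to Theorem 2.9.4.

```python
from itertools import combinations


def add(x, y):
    """Group operation: component-wise addition mod 2."""
    return ((x[0] + y[0]) % 2, (x[1] + y[1]) % 2)


e, a, b, ab = (0, 0), (1, 0), (0, 1), (1, 1)
G = {e, a, b, ab}
subgroups = {"A": {e, a}, "B": {e, b}, "C": {e, ab}}


def is_direct(M, N):
    """G is the direct product of M and N: G = MN and M ∩ N = (e)."""
    return {add(m, n) for m in M for n in N} == G and M & N == {e}


results = {
    (x, y): is_direct(subgroups[x], subgroups[y])
    for x, y in combinations("ABC", 2)
}
```

All three pairs (A, B), (A, C), and (B, C) pass the test, so the same group decomposes in three different ways.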

What we need is a mechanism for picking a_1 and which, when applied 
after we have picked a_1, will allow us to pick a_2, and so on. What should this 
mechanism be? Our control on the elements of G lies only in specifying their 
orders. It is the order of the element, when properly used, that will give us 
the means to prove the theorem. 

Suppose that G = A_1 × A_2 × ··· × A_k, where |G| = p^n and the A’s 
have been numbered, so that |A_i| = p^{n_i} and n_1 ≥ n_2 ≥ ··· ≥ n_k, and each A_i 
is cyclic generated by a_i. If this were so and x = a_1^{m_1} ··· a_k^{m_k}, then 


    x^{p^{n_1}} = (a_1^{m_1} ··· a_k^{m_k})^{p^{n_1}} = a_1^{m_1 p^{n_1}} a_2^{m_2 p^{n_1}} ··· a_k^{m_k p^{n_1}} 


because n_1 ≥ n_i, p^{n_i} | p^{n_1}, so since every a_i^{m_i p^{n_1}} = e, thus 
x^{p^{n_1}} = e. In other words, a_1 should be an element of G whose order is as 
large as it can possibly be. Fine, we can now pick a_1. What do we do for a_2? 
If Ḡ = G/A_1, then to get the first element needed to represent Ḡ as a direct 
product of cyclic groups, we should pick an element in Ḡ whose order is 
maximal. What does this translate into in G itself? We want an element a_2 
such that a_2 requires as high a power as possible to fall into A_1. So that will 
be the road to the selection of the second element. However, if we pick an 
element a_2 with this property, it may not do the trick; we may have to adapt 
it so that it will. The doing of all this is the technical part of the argument 
and does go through. Then one repeats it appropriately to find an element 
a_3, and so on. 

This is the procedure we shall be going through to prove the theorem. 
But to smooth out these successive choices of a_1, a_2, ..., we shall use an 
induction argument and some subsidiary preliminary results. 

With this sketch as guide we hope the proof of the theorem will make 
sense to the reader. One should not confuse the basic idea in the proof— 
which is quite reasonable—with the technical details, which may cloud the 
issue. So we now begin to fill in the details of the sketch of the proof that we 
outlined above. 


Lemma 2.10.1. Let G be a finite abelian group of order mn, where m 
and n are relatively prime. If M = {x ∈ G | x^m = e} and N = {x ∈ G | x^n = e}, 
then G = M × N. Moreover, if neither m nor n is 1, then M ≠ (e) and N ≠ (e). 


Proof. The sets M and N defined in the assertion above are quickly 
seen to be subgroups of G. Moreover, if m ≠ 1, then by Cauchy’s Theorem 
(Theorem 2.6.4) we readily obtain M ≠ (e), and similarly if n ≠ 1, that 
N ≠ (e). Furthermore, since M ∩ N is a subgroup of both M and N, by 
Lagrange’s Theorem, |M ∩ N| divides |M| = m and |N| = n. Because m and n 
are relatively prime, we obtain |M ∩ N| = 1, hence M ∩ N = (e). 

To finish the proof, we need to show that G = MN and G = M × N. 
Since m and n are relatively prime, there exist integers r and s such that 
rm + sn = 1. If a ∈ G, then a = a^1 = a^{sn+rm} = a^{sn} a^{rm}; since (a^{sn})^m = 
a^{snm} = e, we have that a^{sn} ∈ M. Similarly, a^{rm} ∈ N. Thus a = a^{sn} a^{rm} is 
in MN. In this way G = MN. It now follows from the Corollary to Theorem 
2.9.4 that G = M × N. □ 
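
The splitting a = a^{sn} a^{rm} used in the proof can be watched in a concrete group. As an example of our own choosing, the sketch below takes G to be the nonzero integers mod 13 under multiplication, a group of order 12 = 4·3, with m = 4, n = 3 and the integers r = 1, s = -1 satisfying rm + sn = 1. It relies on Python's three-argument pow accepting a negative exponent, which requires Python 3.8 or later.

```python
p, m, n = 13, 4, 3      # the nonzero residues mod 13 form a group of order 12 = m * n
r, s = 1, -1            # r*m + s*n = 4 - 3 = 1

G = range(1, p)
M = {x for x in G if pow(x, m, p) == 1}   # elements with x^m = e
N = {x for x in G if pow(x, n, p) == 1}   # elements with x^n = e


def split(a):
    """Return (a^{sn}, a^{rm}): the M-component and the N-component of a."""
    return pow(a, s * n, p), pow(a, r * m, p)


# Every a in G is the product of its two components, which lie in M and N.
ok = all(
    split(a)[0] in M and split(a)[1] in N and (split(a)[0] * split(a)[1]) % p == a
    for a in G
)
```

In this example M has 4 elements, N has 3, and they meet only in the unit element, so G = M × N as the lemma asserts.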


An immediate consequence is the 


Corollary. Let G be a finite abelian group and let p be a prime such 
that p divides |G|. Then G = P × T for some subgroups P and T, where 
|P| = p^m, m > 0, and |T| is not divisible by p. 


Proof. Let P = {x ∈ G | x^{p^s} = e for some s} and let the subset 
T = {x ∈ G | x^t = e for t relatively prime to p}. By Lemma 2.10.1, G = P × T 
and P ≠ (e). Since every element in P has order a power of p, |P| is not 
divisible by any other prime (by Cauchy’s Theorem), so |P| = p^m for some m. 

It is easy to see that p ∤ |T| by making use of Lagrange’s Theorem. Thus 
we really have that P is not merely some subgroup of G but is what is called 
a p-Sylow subgroup of G. (See Section 11.) □ 


We now come to the key step in the proof of the theorem we seek. 
The proof is a little difficult, but once we have this result the rest will be 
easy. 


Theorem 2.10.2. Let G be an abelian group of order p^n, p a prime, 
and let a ∈ G have maximal order of all the elements in G. Then G = 
A × Q, where A is the cyclic subgroup generated by a. 


Proof. We proceed by induction on n. If n = 1, then |G| = p and G is 
already a cyclic group generated by any a ≠ e in G. 

We suppose the theorem to be true for all m < n. We first show that 
the theorem is correct if there exists an element b ∈ G such that b ∉ A = (a) 
and b^p = e. Let B = (b), the subgroup of G generated by b; thus 
A ∩ B = (e) (see Problem 1). 

Let Ḡ = G/B; by assumption B ≠ (e), hence |Ḡ| < |G|. In Ḡ, what is 
the order of ā = Ba? We claim that o(ā) = o(a). To begin with, we know 
that o(ā) | o(a) (see Problem 6 of Section 2.7). On the other hand, ā^{o(ā)} = ē, 
so a^{o(ā)} ∈ B. Since a^{o(ā)} ∈ A, we see that a^{o(ā)} ∈ A ∩ B = (e), whence 
a^{o(ā)} = e. This tells us that o(a) | o(ā). Hence o(a) = o(ā). 

Since ā is an element of maximal order in Ḡ, by the induction we know 
that Ḡ = (ā) × T for some subgroup T of Ḡ. By the Correspondence 
Theorem we also know that T = Q/B for some subgroup Q of G. We claim 
that G is the internal direct product A × Q. That G = AQ is left to 
the reader. It remains to show that A ∩ Q = (e). Let a^i ∈ A ∩ Q. Then 
ā^i ∈ Q/B = T, and since (ā) ∩ T = (ē), we have that ā^i = ē. But since o(a) = 
o(ā), this implies a^i = e. Therefore, A ∩ Q = (e) and we obtain that 
G = A × Q. 

Suppose, then, that there is no element b in G, b not in A, such that 
b^p = e. We claim that this forces G = A = (a), in which case G is a cyclic 
group. Suppose that G ≠ A and let x ∈ G, x ∉ A have smallest possible 
order. Because o(x^p) < o(x), we have, by our choice of x, that x^p ∈ A, hence 
x^p = a^i for some i. 

We claim that p | i. Let o(a) = p^s, and note that the maximality 
of the order of a implies that x^{p^s} = e. But x^{p^s} = (x^p)^{p^{s-1}} = 
(a^i)^{p^{s-1}} = e. Since o(a) = p^s, we have p | i. 

Thus x^p = a^i, where p | i. Let y = a^{-i/p} x. Then y^p = a^{-i} x^p = a^{-i} a^i = e. 
Moreover, y ∉ (a) = A, because x ∉ A. But this puts us back in the situation 
discussed above, where there exists a b ∈ G, b ∉ A such that b^p = e; in that 
case we saw that the theorem was correct. So we must have G = (a), and G 
is a cyclic group. This finishes the induction and proves the theorem. □ 


We are now able to prove the very basic and important 


Theorem 2.10.3 (Fundamental Theorem on Finite Abelian Groups). 
A finite abelian group is the direct product of cyclic groups. 


Proof. Let G be a finite abelian group and p a prime that divides |G|. 
By the Corollary to Lemma 2.10.1, G = P × T, where |P| = p^n. By Theorem 
2.10.2, P = A_1 × A_2 × ··· × A_k, where the A_i are cyclic subgroups of P. 
Arguing by induction on |G|, we may thus assume that T = T_1 × T_2 × ··· × T_q, 
where the T_i are cyclic subgroups of T. Thus 


    G = (A_1 × A_2 × ··· × A_k) × (T_1 × T_2 × ··· × T_q) 
      = A_1 × A_2 × ··· × A_k × T_1 × T_2 × ··· × T_q. 


This very important theorem is now proved. □ 


We return to abelian groups G of order p^n. We now have at hand that 
G = A_1 × A_2 × ··· × A_k, where the A_i are cyclic groups of order p^{n_i}. We 
can arrange the numbering so that n_1 ≥ n_2 ≥ ··· ≥ n_k. Also, |G| = 
|A_1 × A_2 × ··· × A_k| = |A_1| |A_2| ··· |A_k|, which gives us that 


    p^n = p^{n_1} p^{n_2} ··· p^{n_k} = p^{n_1 + n_2 + ··· + n_k}, 


hence n = n_1 + n_2 + ··· + n_k. Thus the integers n_i ≥ 0 give us a partition of 
n. It can be shown that these integers n_1, n_2, ..., n_k, which are called the 
invariants of G, are unique. In other words, two abelian groups of order p^n 
are isomorphic if and only if they have the same invariants. Granted this, it 
follows that the number of nonisomorphic abelian groups of order p^n is equal 
to the number of partitions of n. 

For example, if n = 3, it has the following three partitions: 3 = 3, 3 = 
2 + 1, 3 = 1 + 1 + 1, so there are three nonisomorphic abelian groups of 
order p^3 (independent of p). The groups corresponding to these partitions 
are a cyclic group of order p^3, the direct product of a cyclic group of order p^2 
by one of order p, and the direct product of three cyclic groups of order p, 
respectively. 

For n = 4 we see the partitions are 4 = 4, 4 = 3 + 1, 4 = 2 + 2, 4 = 
2 + 1 + 1, 4 = 1 + 1 + 1 + 1, which are five in number. Thus there are five 
nonisomorphic abelian groups of order p^4. Can you describe them via the 
partitions of 4? 

Given an abelian group G of order n = p_1^{a_1} p_2^{a_2} ··· p_k^{a_k}, where the 
p_i are distinct primes and the a_i are all positive, then G is the direct product 
of its so-called p_i-Sylow subgroups (see, e.g., the Corollary to Lemma 2.10.1). 
For each prime p_i there are as many abelian groups of order p_i^{a_i} as there 
are partitions of a_i. So the number of nonisomorphic abelian groups of order 
n = p_1^{a_1} ··· p_k^{a_k} is f(a_1) f(a_2) ··· f(a_k), where f(m) denotes the number 
of partitions of m. Thus we know how many nonisomorphic finite abelian 
groups there are for any given order. 

For instance, how many nonisomorphic abelian groups are there of 
order 144? Since 144 = 2^4 3^2, and there are five partitions of 4, two partitions 
of 2, there are 10 nonisomorphic abelian groups of order 144. 
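
The count f(a_1) f(a_2) ··· f(a_k) is simple to compute. The sketch below is one straightforward way to do it, not a method taken from the text: partitions counts the partitions of m by recursion on the largest part allowed, prime_exponents factors n by trial division, and abelian_group_count multiplies the partition numbers together. All three function names are our own.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def partitions(n, largest=None):
    """Number of partitions of n into parts of size at most `largest`."""
    if largest is None:
        largest = n
    if n == 0:
        return 1
    # Choose the largest part first, then partition the remainder with
    # parts no bigger than that choice.
    return sum(partitions(n - part, part) for part in range(1, min(n, largest) + 1))


def prime_exponents(n):
    """Exponents a_1, ..., a_k in the prime factorization of n (trial division)."""
    exps, d = [], 2
    while d * d <= n:
        if n % d == 0:
            k = 0
            while n % d == 0:
                n //= d
                k += 1
            exps.append(k)
        d += 1
    if n > 1:
        exps.append(1)
    return exps


def abelian_group_count(n):
    """Number of nonisomorphic abelian groups of order n: f(a_1) f(a_2) ... f(a_k)."""
    count = 1
    for a in prime_exponents(n):
        count *= partitions(a)
    return count
```

For n = 144 the exponents are 4 and 2, and abelian_group_count(144) multiplies f(4) = 5 by f(2) = 2, in agreement with the count of 10 found above.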

The material treated in this section has been hard, the path somewhat 
tortuous, and the effort to understand quite intense. To spare the reader too 
much further agony, we assign only three problems to this section. 


PROBLEMS 


1. Let A be a normal subgroup of a group G, and suppose that b ∈ G is an 
element of prime order p, and that b ∉ A. Show that A ∩ (b) = (e). 

2. Let G be an abelian group of order p^n, p a prime, and let a ∈ G have 
maximal order. Show that x^{o(a)} = e for all x ∈ G. 

3. Let G be a finite group, with N ◁ G and a ∈ G. Prove that: 
(a) The order of aN in G/N divides the order of a in G, that is, 
o(aN) | o(a). 
(b) If (a) ∩ N = (e), then o(aN) = o(a). 


11. CONJUGACY AND SYLOW’S THEOREM (OPTIONAL) 


In discussing equivalence relations in Section 4 we mentioned, as an example 
of such a relation in a group G, the notion of conjugacy. Recall that the 
element b in G is said to be conjugate to a ∈ G (or merely, a conjugate of a) if 
there exists an x ∈ G such that b = x^{-1}ax. We showed in Section 4 that this 
defines an equivalence relation on G. The equivalence class of a, which we 
denote by cl(a), is called the conjugacy class of a. 

For a finite group an immediate question presents itself: How large is 
cl(a)? Of course, this depends strongly on the element a. For instance, if 
a ∈ Z(G), the center of G, then ax = xa for all x ∈ G, hence x^{-1}ax = a; in 
other words, the conjugacy class of a in this case consists merely of the 
element a itself. On the other hand, if cl(a) consists only of the element a, then 
x^{-1}ax = a for all x ∈ G. This gives us that xa = ax for all x ∈ G, hence 
a ∈ Z(G). So Z(G) is characterized as the set of those elements a in G 
whose conjugacy class has only one element, a itself. 

For an abelian group G, since G = Z(G), two elements are conjugate if 
and only if they are equal. So conjugacy is not an interesting relation for 
abelian groups; however, for nonabelian groups it is a highly interesting no- 
tion. 

Given a ∈ G, cl(a) consists of all x^{-1}ax as x runs over G. So to 
determine which are the distinct conjugates of a, we need to know when two 
conjugates of a coincide, which is the same as asking: When is x^{-1}ax = y^{-1}ay? 
In this case, transposing, we obtain a(xy^{-1}) = (xy^{-1})a; in other words, xy^{-1} 
must commute with a. This brings us to a concept introduced as Example 10 
in Section 3, that of the centralizer of a in G. We repeat something we did 
there. 


Definition. If a € G, then C(a), the centralizer of a in G, is defined 
by C(a) = {x € G|xa = ax}. 


When C(a) arose in Section 3 we showed that it was a subgroup of G. 
We record this now more officially as 


Lemma 2.11.1. For a ∈ G, C(a) is a subgroup of G.


As we saw above, the two conjugates x^{-1}ax and y^{-1}ay of a are equal
only if xy^{-1} ∈ C(a), that is, only if x and y are in the same right coset
of C(a) in G. On the other hand, if x and y are in the same right coset of
C(a) in G, then xy^{-1} ∈ C(a), hence xy^{-1}a = axy^{-1}. This yields that
x^{-1}ax = y^{-1}ay. So x and y give rise to the same conjugate of a if and
only if x and y are in the same right coset of C(a) in G. Thus there are as
many conjugates of a in G as there are right cosets of C(a) in G. This is
most interesting when G is a finite group, for in that case the number of
right cosets of C(a) in G is what we called the index, i_G(C(a)), of C(a)
in G, and is equal to |G|/|C(a)|.
We have proved 


Theorem 2.11.2. Let G be a finite group and a ∈ G; then the number
of distinct conjugates of a in G equals the index of C(a) in G.

In other words, the number of elements in cl(a) equals i_G(C(a)) =
|G|/|C(a)|.
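
Readers who like to experiment may wish to check Theorem 2.11.2 by machine on a small group. What follows is only a minimal sketch in Python, not part of the original development. It assumes that S_3 is modeled as the six permutations of {0, 1, 2}, each stored as a tuple s that sends i to s[i]; the helper names compose, inverse, conjugacy_class, and centralizer are our own illustrative choices.

```python
from itertools import permutations

def compose(s, t):
    """Product st of two permutations: apply t first, then s."""
    return tuple(s[t[i]] for i in range(len(t)))

def inverse(s):
    """Inverse permutation of s."""
    inv = [0] * len(s)
    for i, image in enumerate(s):
        inv[image] = i
    return tuple(inv)

def conjugacy_class(G, a):
    """cl(a): all conjugates x^{-1} a x as x runs over G."""
    return {compose(inverse(x), compose(a, x)) for x in G}

def centralizer(G, a):
    """C(a): all x in G with xa = ax."""
    return [x for x in G if compose(x, a) == compose(a, x)]

# S_3 as the six permutations of {0, 1, 2} (an assumed model of the group).
S3 = list(permutations(range(3)))

# Theorem 2.11.2: |cl(a)| = |G| / |C(a)| for every a in G.
for a in S3:
    assert len(conjugacy_class(S3, a)) * len(centralizer(S3, a)) == len(S3)

# Sizes of the distinct conjugacy classes; together they exhaust G.
class_sizes = sorted(
    len(c) for c in {frozenset(conjugacy_class(S3, a)) for a in S3}
)
print(class_sizes, sum(class_sizes))
```

The class sizes found this way are 1, 2, and 3, which add up to |S_3| = 6, as they must, since G is the disjoint union of its conjugacy classes.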

This theorem, although it was relatively easy to prove, is very impor- 
tant and has many consequences. We shall see a few of these here. 

One such consequence is a kind of bookkeeping result. Since conjugacy 
is an equivalence relation on G, G is the union of the disjoint conjugacy 
classes. Moreover, by Theorem 2.11.2, we know how many elements there 
are in each class. Putting all this information together, we get 


Theorem 2.11.3 (The Class Equation). If G is a finite group, then 


\[ |G| = \sum_{a} i_G(C(a)) = \sum_{a} \frac{|G|}{|C(a)|}, \]


where the sum runs over one a from each conjugacy class. 


It is almost a sacred tradition among mathematicians to give, as the first 
application of the class equation, a particular theorem about groups of order 
p^n, where p is a prime. Not wanting to be accused of heresy, we follow this
tradition and prove the pretty and important 


Theorem 2.11.4. If G is a group of order p^n, where p is a prime, then
Z(G), the center of G, is not trivial (i.e., there exists an element a ≠ e
in G such that ax = xa for all x ∈ G).


Proof. We shall exploit the class equation to carry out the proof. Let
z = |Z(G)|; as we pointed out previously, z is then the number of elements
in G whose conjugacy class has only one element. Since e ∈ Z(G), z ≥ 1.
For any element b outside Z(G), its conjugacy class contains more than one
element and |C(b)| < |G|. Moreover, since |C(b)| divides |G| by Lagrange’s
theorem, |C(b)| = p^{n(b)}, where 1 ≤ n(b) < n. We divide the pieces of the
class equation into two parts: that coming from the center, and the rest. We
get, this way,

\[ p^n = |G| = z + \sum_{b \notin Z(G)} \frac{|G|}{|C(b)|}
       = z + \sum_{n(b) < n} \frac{p^n}{p^{n(b)}}
       = z + \sum_{n(b) < n} p^{n - n(b)}. \]

Clearly, p divides the left-hand side, p^n, and divides the sum
Σ_{n(b) < n} p^{n - n(b)}. The net result of this is that p | z, and since
z ≥ 1, we have that z is at least p. So since z = |Z(G)|, there must be an
element a ≠ e in Z(G), which proves the theorem. □
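
To see the class equation and Theorem 2.11.4 at work on a concrete group of order 2^3, one may run the short computation below. It is a sketch under stated assumptions, not part of the original text: the group is taken to be the one generated by the permutations (0 2) and (0 1 2 3) of {0, 1, 2, 3} (a copy of the dihedral group of order 8), permutations are tuples s sending i to s[i], and the helper names compose, inverse, and generate are hypothetical.

```python
def compose(s, t):
    """Product st of two permutations: apply t first, then s."""
    return tuple(s[t[i]] for i in range(len(t)))

def inverse(s):
    """Inverse permutation of s."""
    inv = [0] * len(s)
    for i, image in enumerate(s):
        inv[image] = i
    return tuple(inv)

def generate(gens):
    """All products of the given generators; in a finite group this is
    the subgroup they generate."""
    identity = tuple(range(len(gens[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for h in gens:
            new = compose(h, g)
            if new not in group:
                group.add(new)
                frontier.append(new)
    return group

# Assumed example: generators (0 2) and (0 1 2 3), written as rows of images.
G = generate([(2, 1, 0, 3), (1, 2, 3, 0)])

# The center: elements commuting with everything in G.
center = [a for a in G if all(compose(a, x) == compose(x, a) for x in G)]

# The distinct conjugacy classes and their sizes.
classes = {frozenset(compose(inverse(x), compose(a, x)) for x in G) for a in G}
class_sizes = sorted(len(c) for c in classes)

print(len(G), len(center), class_sizes)
```

Here the class equation reads 8 = 1 + 1 + 2 + 2 + 2. The two classes of size 1 are exactly the elements of the center, so z = 2 is divisible by p = 2, in agreement with the proof above.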


This last theorem has an interesting application, which some readers 
may have seen in solving Problem 45 of Section 5. This is


Theorem 2.11.5. If G is a group of order p^2, where p is a prime, then
G is abelian.


Proof. By Theorem 2.11.4, Z(G) ≠ (e), so that there is an element, a,
of order p in Z(G). If A = (a), the subgroup generated by a, then A ⊂ Z(G),
hence A ⊂ C(x) for all x ∈ G. Given x ∈ G, x ∉ A, then C(x) ⊃ A and
x ∈ C(x); so |C(x)| > p, yet |C(x)| must divide p^2. The net result of this is
that |C(x)| = p^2, so C(x) = G, whence x ∈ Z(G). Since every element of G
is in the center of G, G must be abelian. □


In the problems to come we shall give many applications of the nature 
of groups of order p^n, where p is a prime. The natural attack on virtually all
these problems follows the lines of the argument we are about to give. We 
choose one of a wide possible set of choices to illustrate this technique. 


Theorem 2.11.6. If G is a group of order p^n, p a prime, then G
contains a normal subgroup of order p^{n-1}.


Proof. We proceed by induction on n. If n = 1, then G is of order p
and (e) is the required normal subgroup of order p^{1-1} = p^0 = 1.

Suppose that we know that for some k every group of order p^k has a
normal subgroup of order p^{k-1}. Let G be of order p^{k+1}; by Theorem 2.11.4
there exists an element a of order p in Z(G), the center of G. Thus the
subgroup A = (a) generated by a is of order p and is normal in G. Consider
Γ = G/A; Γ is a group of order |G|/|A| = p^{k+1}/p = p^k by Theorem 2.6.3.
Since Γ has order p^k, we know that Γ has a normal subgroup M of order
p^{k-1}. Since Γ is a homomorphic image of G, by the Correspondence Theorem
(Theorem 2.7.2) there is a normal subgroup N in G, N ⊃ A, such that
N/A = M. But then we have

\[ p^{k-1} = |M| = |N/A| = \frac{|N|}{|A|}, \]

that is, p^{k-1} = |N|/p, leading us to |N| = p^k. Thus N is our required
normal subgroup in G of order p^k. This completes the induction and so proves
the theorem. □


By far the most important application we make of the class equation is 
the proof of a far-reaching theorem due to Sylow, a Norwegian mathemati- 
cian, who proved it in 1871. We already showed this theorem to be true for 
abelian groups. We shall now prove it for any finite group. It is impossible to 
overstate the importance of Sylow’s Theorem in the study of finite groups. 
Without it the subject would not get off the ground.


Theorem 2.11.7 (Sylow’s Theorem). Suppose that G is a group of
order p^n m, where p is a prime and p ∤ m. Then G has a subgroup of order p^n.


Proof. If n = 0, this is trivial. We therefore assume that n ≥ 1. Here,
again, we proceed by induction on |G|, assuming the result to be true for all
groups H such that |H| < |G|.

Suppose that the result is false for G. Then, by our induction
hypothesis, p^n cannot divide |H| for any subgroup H of G if H ≠ G. In
particular, if a ∉ Z(G), then C(a) ≠ G, hence p^n ∤ |C(a)|. Thus p divides
|G|/|C(a)| = i_G(C(a)) for a ∉ Z(G).

Write down the class equation for G following the lines of the argument
in Theorem 2.11.4. If z = |Z(G)|, then z ≥ 1 and

\[ p^n m = |G| = z + \sum_{a \notin Z(G)} i_G(C(a)). \]

But p | i_G(C(a)) if a ∉ Z(G), so p | Σ_{a ∉ Z(G)} i_G(C(a)). Since p | p^n m,
we get p | z. By Cauchy’s Theorem there is an element a of order p in Z(G).
If A is the subgroup generated by a, then |A| = p and A ⊲ G, since
a ∈ Z(G). Consider Γ = G/A; |Γ| = |G|/|A| = p^n m/p = p^{n-1} m. Since
|Γ| < |G|, by our induction hypothesis Γ has a subgroup M of order p^{n-1}.
However, by the Correspondence Theorem there is a subgroup P of G such that
P ⊃ A and P/A = M. Therefore, |P| = |M||A| = p^{n-1}p = p^n and P is the
sought-after subgroup of G of order p^n, contradicting our assumption that G
had no such subgroup. This completes the induction, and Sylow’s Theorem is
established. □


Actually, Sylow’s Theorem consists of three parts, of which we only 
proved the first. The other two are (assuming p^n m = |G|, where p ∤ m):


1. Any two subgroups of order p^n in G are conjugate; that is, if |P| =
|Q| = p^n for subgroups P, Q of G, then for some x ∈ G, Q = x^{-1}Px.

2. The number of subgroups of order p^n in G is of the form 1 + kp and
divides |G|.


Since these subgroups of order p^n pop up all over the place, they are
called p-Sylow subgroups of G. An abelian group has one p-Sylow subgroup
for every prime p dividing its order. This is far from true in the general case.
For instance, if G = S_3, the symmetric group of degree 3, which has order 6
= 2 · 3, there are three 2-Sylow subgroups (of order 2) and one 3-Sylow
subgroup (of order 3).
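
The count just quoted for S_3 is easy to confirm by brute force. The sketch below is an illustration only and rests on two assumptions: S_3 is modeled as the permutations of {0, 1, 2} stored as tuples, and we use the fact that a group of prime order is cyclic, so that the subgroups of order 2 and 3 are exactly the cyclic subgroups (a) of those orders. The names compose, cyclic_subgroup, and subgroups_of_prime_order are hypothetical.

```python
from itertools import permutations

def compose(s, t):
    """Product st of two permutations: apply t first, then s."""
    return tuple(s[t[i]] for i in range(len(t)))

def cyclic_subgroup(a):
    """The subgroup (a) = {e, a, a^2, ...} generated by a."""
    identity = tuple(range(len(a)))
    elements, power = {identity}, a
    while power != identity:
        elements.add(power)
        power = compose(a, power)
    return frozenset(elements)

def subgroups_of_prime_order(G, p):
    """Subgroups of prime order p; each is cyclic, hence of the form (a)."""
    return {H for H in map(cyclic_subgroup, G) if len(H) == p}

S3 = list(permutations(range(3)))
two_sylows = subgroups_of_prime_order(S3, 2)
three_sylows = subgroups_of_prime_order(S3, 3)
print(len(two_sylows), len(three_sylows))
```

The computation finds three subgroups of order 2 and one of order 3. Both counts have the form 1 + kp (3 = 1 + 1 · 2 and 1 = 1 + 0 · 3) and both divide 6, as the third part of Sylow’s Theorem predicts.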

For those who want to see several proofs of that part of Sylow’s Theo- 
rem which we proved above, and of the other two parts, they might look at 
the appropriate section of our book Topics in Algebra.


PROBLEMS
Easier Problems 


1. In S_3, the symmetric group of degree 3, find all the conjugacy classes,
and check the validity of the class equation by determining the orders of
the centralizers of the elements of S_3.

2. Do Problem 1 for G the dihedral group of order 8.

3. If a ∈ G, show that C(x^{-1}ax) = x^{-1}C(a)x.

4. If φ is an automorphism of G, show that C(φ(a)) = φ(C(a)) for a ∈ G.

5. If |G| = p^3 and |Z(G)| ≥ p^2, prove that G is abelian.

6. If P is a p-Sylow subgroup of G and P ⊲ G, prove that P is the only
p-Sylow subgroup of G.

7. If P ⊲ G, P a p-Sylow subgroup of G, prove that φ(P) = P for every
automorphism φ of G.

8. Use the class equation to give a proof of Cauchy’s Theorem.

If H is a subgroup of G, let N(H) = {x ∈ G | x^{-1}Hx = H}. This does
not mean that xa = ax whenever x ∈ N(H), a ∈ H. For instance, if
H ⊲ G, then N(H) = G, yet H need not be in the center of G.

9. Prove that N(H) is a subgroup of G, H ⊂ N(H) and in fact H ⊲ N(H).

10. Prove that N(x^{-1}Hx) = x^{-1}N(H)x.

11. If P is a p-Sylow subgroup of G, prove that P is a p-Sylow subgroup of
N(P) and is the only p-Sylow subgroup of N(P).

12. If P is a p-Sylow subgroup and a ∈ G is of order p^m for some m, show
that if a^{-1}Pa = P then a ∈ P.

13. Prove that if G is a finite group and H is a subgroup of G, then the
number of distinct subgroups x^{-1}Hx of G equals i_G(N(H)).

14. If P is a p-Sylow subgroup of G, show that the number of distinct
x^{-1}Px cannot be a multiple of p.

15. If N ⊲ G, let B(N) = {x ∈ G | xa = ax for all a ∈ N}. Prove that
B(N) ⊲ G.


Middle-Level Problems 


16. Show that a group of order 36 has a normal subgroup of order 3 or 9. 
(Hint: See Problem 40 of Section 5.) 


17. Show that a group of order 108 has a normal subgroup of order 9 or 27. 
18. If P is a p-Sylow subgroup of G, show that N(N(P)) = N(P). 
19. If |G| = p^n, show that G has a subgroup of order p^m for all 1 ≤ m ≤ n.

20. If p^m divides |G|, show that G has a subgroup of order p^m.

21. If |G| = p^n and H ≠ G is a subgroup of G, show that N(H) ⊋ H.

22. Show that any subgroup of order p^{n-1} in a group G of order p^n is
normal in G.


Harder Problems 


23. Let G be a group, H a subgroup of G. Define for a, b ∈ G, a ~ b if
b = h^{-1}ah for some h ∈ H. Prove that
(a) this defines an equivalence relation on G.

(b) If [a] is the equivalence class of a, show that if G is a finite group,
then [a] has m elements where m is the index of H ∩ C(a) in H.

24. If G is a group, H a subgroup of G, define a relation B ~ A for
subgroups A, B of G by the condition that B = h^{-1}Ah for some h ∈ H.

(a) Prove that this defines an equivalence relation on the set of
subgroups of G.

(b) If G is finite, show that the number of distinct subgroups equivalent
to A equals the index of N(A) ∩ H in H.

25. If P is a p-Sylow subgroup of G, let S be the set of all p-Sylow subgroups
of G. For Q_1, Q_2 ∈ S define Q_1 ~ Q_2 if Q_1 = a^{-1}Q_2 a with a ∈ P.
Prove, using this relation, that if Q ≠ P, then the number of distinct
a^{-1}Qa, with a ∈ P, is a multiple of p.

26. Using the result of Problem 25, show that the number of p-Sylow
subgroups of G is of the form 1 + kp. (This is the third part of Sylow’s
Theorem.)

27. Let P be a p-Sylow subgroup of G, and Q another one. Suppose that
Q ≠ x^{-1}Px for any x ∈ G. Let S be the set of all y^{-1}Qy, as y runs over G.
For Q_1, Q_2 ∈ S define Q_1 ~ Q_2 if Q_1 = a^{-1}Q_2 a, where a ∈ P.

(a) Show that this implies that the number of distinct y^{-1}Qy is a multiple
of p.

(b) Using the result of Problem 14, show that the result of Part (a)
cannot hold.

(c) Prove from this that given any two p-Sylow subgroups P and Q of G,
then Q = x^{-1}Px for some x ∈ G.
(This is the second part of Sylow’s Theorem.)

28. If H is a subgroup of G of order p^m show that H is contained in some
p-Sylow subgroup of G.

29. If P is a p-Sylow subgroup of G and a, b ∈ Z(P) are conjugate in G,
prove that they are already conjugate in N(P).


3 


THE SYMMETRIC GROUP 


1. PRELIMINARIES 


Let us recall a theorem proved in Chapter 2 for abstract groups. This result, 
known as Cayley’s Theorem (Theorem 2.5.1), asserts that any group G is iso- 
morphic to a subgroup of A(S), the set of 1-1 mappings of the set S onto it- 
self, for some suitable S. In fact, in the proof we gave we used for S the 
group G itself viewed merely as a set. 

Historically, groups arose this way first, long before the notion of an 
abstract group was defined. We find in the work of Lagrange, Abel, Galois, 
and others, results on groups of permutations proved in the late eighteenth 
and early nineteenth centuries. Yet it was not until the mid-nineteenth cen- 
tury that Cayley more or less introduced the abstract concept of a group. 

Since the structure of isomorphic groups is the same, Cayley’s Theorem 
points out a certain universal character for the groups A(S). If we knew the 
structure of all subgroups of A(S) for any set S, we would know the structure 
of all groups. This is much too much to expect. Nevertheless, one could try to 
exploit this embedding of an arbitrary group G isomorphically into some 
A(S). This has the advantage of transforming G as an abstract system into 
something more concrete, namely a set of nice mappings of some set onto 
itself. 

We shall not be concerned with the subgroups of A(S) for an arbitrary 
set S. If S is infinite, A(S) is a very wild and complicated object. Even if S is 
finite, the complete nature of A(S) is virtually impossible to determine.

In this chapter we consider only A(S) for S a finite set. Recall that if S
has n elements, then we call A(S) the symmetric group of degree n, and
denote it by S_n. The elements of S_n are called permutations; we shall
denote them by lowercase Greek letters.

Since we multiplied two elements σ, τ ∈ A(S) by the rule (στ)(s) =
σ(τ(s)), this will have the effect that when we introduce the appropriate
symbols to represent the elements of S_n, these symbols, or permutations,
will multiply from right to left. If the readers look at some other book on
algebra, they should make sure which way the permutations are being
multiplied: right to left or left to right. Very often, algebraists multiply
permutations from left to right. To be consistent with our definition of the
composition of elements in S_n, we do it from right to left.

By Cayley’s Theorem we know that if G is a finite group of order n,
then G is isomorphic to a subgroup of S_n and S_n has n! elements. Speaking
loosely, we usually say that G is a subgroup of S_n. Since n is so much
smaller than n! for n even modestly large, our group occupies only a tiny
little corner in S_n. It would be desirable to embed G in an S_n for n as
small as possible. For certain classes of finite groups this is achievable in
a particularly nice way.

Let S be a finite set having n elements; we might as well suppose that
S = {x_1, x_2, ..., x_n}. Given the permutation σ ∈ S_n = A(S), then
σ(x_k) ∈ S for k = 1, 2, ..., n, so σ(x_k) = x_{i_k} for some i_k,
1 ≤ i_k ≤ n. Because σ is 1-1, if j ≠ k, then x_{i_j} = σ(x_j) ≠ σ(x_k) =
x_{i_k}. Therefore, the numbers i_1, i_2, ..., i_n are merely the numbers
1, 2, ..., n shuffled about in some order.

Clearly, the action of σ on S is determined by what σ does to the
subscript j of x_j, so the symbol “x” is really excess baggage and, as such,
can be discarded. In short, we may assume that S = {1, 2, ..., n}.

Let’s recall what is meant by the product of two elements of A(S). If
σ, τ ∈ A(S), then we defined στ by (στ)(s) = σ(τ(s)) for every s ∈ S. We
showed in Section 4 of Chapter 1 that A(S) satisfied four properties that we
used later as the model to define the notion of an abstract group. Thus S_n,
in particular, is a group relative to the product of mappings.

Our first need is some handy way of denoting a permutation, that is, an
element σ in S_n. One clear way is to make a table of what σ does to each
element of S. This might be called the graph of σ. We did this earlier,
writing out σ, say σ ∈ S_3, in the fashion: σ: x_1 → x_2, x_2 → x_3,
x_3 → x_1. But this is cumbersome and space consuming. We certainly can make
it more compact by dropping the x’s and writing
\[ \sigma = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}. \]
In this symbol the number in the second row is the image under σ of the
number in the first row directly above it. There is nothing holy about 3 in
all this; it works equally well for
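any n.

If σ ∈ S_n and σ(1) = i_1, σ(2) = i_2, ..., σ(n) = i_n, we use the symbol

\[ \begin{pmatrix} 1 & 2 & \cdots & n \\ i_1 & i_2 & \cdots & i_n \end{pmatrix} \]

to represent σ and we write

\[ \sigma = \begin{pmatrix} 1 & 2 & \cdots & n \\ i_1 & i_2 & \cdots & i_n \end{pmatrix}. \]

Note that it is not necessary to write the first row in the usual order
1 2 ... n; any way we write the first row, as long as we carry the i_j’s
along accordingly, we still have σ. For instance, in the example in S_3
cited,

\[ \sigma = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}
          = \begin{pmatrix} 3 & 1 & 2 \\ 1 & 2 & 3 \end{pmatrix}
          = \begin{pmatrix} 2 & 1 & 3 \\ 3 & 2 & 1 \end{pmatrix}. \]

If we know

\[ \sigma = \begin{pmatrix} 1 & 2 & \cdots & n \\ i_1 & i_2 & \cdots & i_n \end{pmatrix}, \]

what is σ^{-1}? It is easy, just flip the symbol for σ over to get

\[ \sigma^{-1} = \begin{pmatrix} i_1 & i_2 & \cdots & i_n \\ 1 & 2 & \cdots & n \end{pmatrix}. \]

(Prove!) In our example

\[ \sigma = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}, \qquad
   \sigma^{-1} = \begin{pmatrix} 2 & 3 & 1 \\ 1 & 2 & 3 \end{pmatrix}
               = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}. \]

The identity element, which we shall write as e, is merely

\[ e = \begin{pmatrix} 1 & 2 & \cdots & n \\ 1 & 2 & \cdots & n \end{pmatrix}. \]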

How does the product in S_n translate in terms of these symbols? Since
στ means: “First apply τ and to the result of this apply σ,” in forming the
product of the symbols for σ and τ we look at the number k in the first row
of τ and see what number i_k is directly below k in the second row of τ. We
then look at the spot i_k in the first row of σ and see what is directly below
it in the second row of σ. This is the image of k under στ. We then run
through k = 1, 2, ..., n and get the symbol for στ. We just do this visually.

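We illustrate this with two permutations

\[ \sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 3 & 1 & 5 & 4 \end{pmatrix}
   \quad\text{and}\quad
   \tau = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 3 & 4 & 5 & 1 & 2 \end{pmatrix} \]

in S_5. Then

\[ \sigma\tau = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 1 & 5 & 4 & 2 & 3 \end{pmatrix}. \]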
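
The visual rule for multiplying two symbols can also be carried out mechanically. The following is a small sketch in Python, offered as an illustration rather than as part of the text. It assumes a permutation of {1, 2, ..., n} is stored as the tuple forming the second row of its symbol, so that sigma[k - 1] is the image of k; the function names compose and inverse are our own.

```python
def compose(sigma, tau):
    """Second row of the product sigma*tau: apply tau first, then sigma."""
    return tuple(sigma[tau[k] - 1] for k in range(len(tau)))

def inverse(sigma):
    """Second row of sigma^{-1}, obtained by flipping the symbol over."""
    inv = [0] * len(sigma)
    for k, image in enumerate(sigma, start=1):
        inv[image - 1] = k
    return tuple(inv)

# The two permutations of the example, given by their second rows.
sigma = (2, 3, 1, 5, 4)
tau = (3, 4, 5, 1, 2)

sigma_tau = compose(sigma, tau)   # right to left: tau acts first
tau_sigma = compose(tau, sigma)   # the other order, for comparison

print(sigma_tau)
print(tau_sigma)
print(inverse(sigma))
```

Running this gives (1, 5, 4, 2, 3) for στ, in agreement with the symbol displayed above, and (4, 5, 3, 2, 1) for τσ, so the order in which the two permutations are multiplied does matter here.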
Even the economy achieved this way is not enough. After all, the first
row is always 1 2 ... n, so we could dispense with it, and write

\[ \sigma = \begin{pmatrix} 1 & 2 & \cdots & n \\ i_1 & i_2 & \cdots & i_n \end{pmatrix} \]

as (i_1, i_2, ..., i_n). This is fine, but in the next section we
shall find a better and briefer way of representing permutations.


PROBLEMS 


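1. Find the products:

(a) \[ \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 6 & 4 & 5 & 2 & 1 & 3 \end{pmatrix}
       \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 2 & 3 & 4 & 5 & 6 & 1 \end{pmatrix}. \]

(b) \[ \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 1 & 3 & 4 & 5 \end{pmatrix}
       \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 3 & 2 & 1 & 4 & 5 \end{pmatrix}. \]

(c) \[ \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 4 & 1 & 3 & 2 & 5 \end{pmatrix}^{-1}
       \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 1 & 3 & 4 & 5 \end{pmatrix}
       \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 4 & 1 & 3 & 2 & 5 \end{pmatrix}. \]

2. Evaluate all the powers of each permutation σ (i.e., find σ^k for all k).

(a) \[ \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 2 & 3 & 4 & 5 & 6 & 1 \end{pmatrix}. \]

(b) First row 1 2 3 4 5 6 7; [second row not recoverable].

(c) \[ \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 6 & 4 & 5 & 2 & 1 & 3 \end{pmatrix}. \]

3. Prove that
\[ \begin{pmatrix} 1 & 2 & \cdots & n \\ i_1 & i_2 & \cdots & i_n \end{pmatrix}^{-1}
   = \begin{pmatrix} i_1 & i_2 & \cdots & i_n \\ 1 & 2 & \cdots & n \end{pmatrix}. \]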


4. Find the order of each element in Problem 2. 
5. Find the order of the products you obtained in Problem 1. 


2. CYCLE DECOMPOSITION 


We continue the process of simplifying the notation used to represent a given 
permutation. In doing so, we get something more than just a new symbol; we 
get a device to decompose any permutation as a product of particularly nice 
permutations. 


Definition. Let i_1, i_2, ..., i_k be k distinct integers in S = {1, 2, ..., n}.
The symbol (i_1 i_2 ... i_k) will represent the permutation σ ∈ S_n, where
σ(i_1) = i_2, σ(i_2) = i_3, ..., σ(i_j) = i_{j+1} for j < k, σ(i_k) = i_1, and
σ(s) = s for any s ∈ S if s is different from i_1, i_2, ..., i_k.


Thus, in S_7, the permutation (1 3 5 4) is the permutation

\[ \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 3 & 2 & 5 & 1 & 4 & 6 & 7 \end{pmatrix}. \]

We call a permutation of the form (i_1 i_2 ... i_k) a k-cycle. For the
special case k = 2, the permutation (i_1 i_2) is called a transposition. Note
that if σ = (i_1 i_2 ... i_k), then σ is also (i_k i_1 i_2 ... i_{k-1}),
(i_{k-1} i_k i_1 i_2 ... i_{k-2}), and so on. (Prove!) For example,

(1 3 5 4) = (4 1 3 5) = (5 4 1 3) = (3 5 4 1).

Two cycles, say a k-cycle and an m-cycle, are said to be disjoint cycles if
they have no integer in common. Whence (1 3 5) and (4 2 6 7) in S_7
are disjoint cycles.

Given two disjoint cycles in S_n, we claim that they commute. We leave
the proof of this to the reader, with the suggestion that if σ, τ are disjoint
cycles, the reader should verify that (στ)(i) = (τσ)(i) for every i ∈ S =
{1, 2, ..., n}. We state this result as


Lemma 3.2.1. If σ, τ ∈ S_n are disjoint cycles, then στ = τσ.


Let’s consider a particular k-cycle σ = (1 2 ... k) in S_n. Clearly,
σ(1) = 2 by the definition given above; how is 3 related to 1? Since σ(2) = 3,
we have σ^2(1) = σ(2) = 3. Continuing, we see that σ^j(1) = j + 1 for
j ≤ k − 1, while σ^k(1) = 1. In fact, we see that σ^k = e, where e is the
identity element in S_n.

There are two things to be concluded from the paragraph above.


1. The order of a k-cycle, as an element of S_n, is k. (Prove!)

2. If σ = (i_1 i_2 ... i_k) is a k-cycle, then the orbit of i_1 under σ (see
Problem 27 in Section 4 of Chapter 1) is {i_1, i_2, ..., i_k}. So we can see
that the k-cycle σ = (i_1 i_2 ... i_k) is


σ = (i_1 σ(i_1) σ^2(i_1) ... σ^{k-1}(i_1)).


Given any permutation τ in S_n, for i ∈ {1, 2, ..., n}, consider the orbit
of i under τ. We have that this orbit is {i, τ(i), τ^2(i), ..., τ^{s-1}(i)},
where τ^s(i) = i and s is the smallest positive integer with this property.
Consider the s-cycle (i τ(i) τ^2(i) ... τ^{s-1}(i)); we call it the cycle of
τ determined by i.

We take a specific example and find all its cycles. Let

\[ \tau = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\
                          3 & 9 & 4 & 1 & 5 & 6 & 2 & 7 & 8 \end{pmatrix}; \]

what is the cycle of τ determined by 1? We claim that it is (1 3 4). Why?
τ takes 1 into 3, 3 into 4, and 4 into 1; that is, τ(1) = 3,
τ^2(1) = τ(3) = 4, and τ^3(1) = τ(4) = 1. We can get this visually by weaving
through the symbol for τ along the path 1 → 3 → 4 → 1. What is the cycle of
τ determined by 2? Weaving through the symbol along the path
2 → 9 → 8 → 7 → 2, we see that the cycle of τ determined by 2 is
(2 9 8 7). The cycles of τ determined by 5 and 6 are (5) and (6),
respectively, since 5 and 6 are left fixed by τ. So the cycles of τ are
(1 3 4), (2 9 8 7), (5), and (6). Therefore we have that
τ = (1 3 4)(2 9 8 7)(5)(6), where we view these cycles, as defined above, as
permutations in S_9, because every integer in S = {1, 2, ..., 9} appears in
one and only one cycle, and the image of any i under τ is read off from the
cycle in which it appears.
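
The weaving procedure used in this example is entirely mechanical, and can be sketched in a few lines of Python. This is an illustration under our own conventions, not part of the original text: a permutation of {1, 2, ..., n} is stored as the tuple forming the second row of its symbol, and the function name cycles is hypothetical.

```python
def cycles(perm):
    """Disjoint cycles of perm (1-cycles included), each started at its
    smallest element; perm[k - 1] is the image of k."""
    seen = set()
    result = []
    for start in range(1, len(perm) + 1):
        if start in seen:
            continue
        # Follow start -> perm(start) -> perm^2(start) -> ... back to start.
        cycle = [start]
        seen.add(start)
        nxt = perm[start - 1]
        while nxt != start:
            cycle.append(nxt)
            seen.add(nxt)
            nxt = perm[nxt - 1]
        result.append(tuple(cycle))
    return result

# The permutation of the example, given by its second row.
tau = (3, 9, 4, 1, 5, 6, 2, 7, 8)
decomposition = cycles(tau)
print(decomposition)
```

For the τ above this produces the cycles (1 3 4), (2 9 8 7), (5), and (6), the same list found by hand.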

There is nothing special about the permutation τ above that made the
argument we gave go through. The same argument would hold for any
permutation in S_n for any n. We leave the formal writing down of the proof
to the reader.


Theorem 3.2.2. Every permutation in S_n is the product of disjoint
cycles.


In writing a permutation σ as a product of disjoint cycles, we omit all
1-cycles; that is, we ignore the i’s such that σ(i) = i. Thus we write
σ = (1 2 3)(4 5)(6)(7) simply as σ = (1 2 3)(4 5). In other words, writing
σ as a product of k-cycles, with k > 1, we assume that σ leaves fixed any
integer not present in any of the cycles. Thus in the group S_{11} the
permutation τ = (1 5 6)(2 3 9 8 7) leaves fixed 4, 10, and 11.


Lemma 3.2.3. If τ in S_n is a k-cycle, then the order of τ is k; that is,
τ^k = e and τ^j ≠ e for 0 < j < k.


Consider the permutation τ = (1 2)(3 4 5 6)(7 8 9) in S_9.
What is its order? Since the disjoint cycles (1 2), (3 4 5 6), (7 8 9)
commute, τ^m = (1 2)^m (3 4 5 6)^m (7 8 9)^m; in order that τ^m = e we
need (1 2)^m = e, (3 4 5 6)^m = e, (7 8 9)^m = e. (Prove!) To have
(7 8 9)^m = e, we must have 3 | m, since (7 8 9) is of order 3; to have
(3 4 5 6)^m = e, we must have 4 | m, because (3 4 5 6) is of order 4,
and to have (1 2)^m = e, we must have 2 | m, because (1 2) is of order 2.
This tells us that m must be divisible by 12.

On the other hand,


τ^{12} = (1 2)^{12} (3 4 5 6)^{12} (7 8 9)^{12} = e.


So τ is of order 12.

Here, again, the special properties of τ do not enter the picture. What
we did for τ works for any permutation. To formulate this properly, recall
that the least common multiple of m and n is the smallest positive integer v
which is divisible by m and by n. (See Problem 7, Chapter 1, Section 5.) Then
we have


Theorem 3.2.4. Let σ ∈ S_n have its cycle decomposition into disjoint
cycles of length m_1, m_2, ..., m_k. Then the order of σ is the least common
multiple of m_1, m_2, ..., m_k.


Proof. Let σ = τ_1 τ_2 ... τ_k, where the τ_i are disjoint cycles of length
m_i. Since the τ_i are disjoint cycles, τ_i τ_j = τ_j τ_i; therefore if M is
the least common multiple of m_1, m_2, ..., m_k, then
σ^M = (τ_1 τ_2 ... τ_k)^M = τ_1^M τ_2^M ... τ_k^M = e (since τ_i^M = e
because τ_i is of order m_i and m_i | M). Therefore, the order of σ is at
most M. On the other hand, if σ^N = e, then τ_1^N τ_2^N ... τ_k^N = e. This
forces each τ_i^N = e (prove!) because the τ_i are disjoint permutations, so
m_i | N, since τ_i is of order m_i. Thus N is divisible by the least common
multiple of m_1, m_2, ..., m_k, so M | N. Consequently, we see that σ is of
order M as claimed in the theorem. □
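
Theorem 3.2.4 lends itself to a quick numerical check. The sketch below, which is ours and not part of the original text, computes the order of a permutation in two ways: as the least common multiple of its cycle lengths, and by brute force, multiplying the permutation by itself until the identity appears. It assumes permutations of {1, 2, ..., n} are stored as the second row of their symbol; the names cycle_lengths, order_by_lcm, and order_by_powers are hypothetical.

```python
from functools import reduce
from math import gcd

def compose(s, t):
    """Product st on {1, ..., n}: apply t first, then s."""
    return tuple(s[t[k] - 1] for k in range(len(t)))

def cycle_lengths(perm):
    """Lengths of the disjoint cycles of perm, 1-cycles included."""
    seen, lengths = set(), []
    for start in range(1, len(perm) + 1):
        if start in seen:
            continue
        length, current = 0, start
        while current not in seen:
            seen.add(current)
            current = perm[current - 1]
            length += 1
        lengths.append(length)
    return lengths

def order_by_lcm(perm):
    """Order predicted by Theorem 3.2.4: lcm of the cycle lengths."""
    return reduce(lambda a, b: a * b // gcd(a, b), cycle_lengths(perm), 1)

def order_by_powers(perm):
    """Smallest k >= 1 with perm^k = e, found by repeated multiplication."""
    identity = tuple(range(1, len(perm) + 1))
    power, k = perm, 1
    while power != identity:
        power, k = compose(perm, power), k + 1
    return k

# tau = (1 2)(3 4 5 6)(7 8 9) in S_9, written as a second row.
tau = (2, 1, 4, 5, 6, 3, 8, 9, 7)
print(cycle_lengths(tau), order_by_lcm(tau), order_by_powers(tau))
```

For τ = (1 2)(3 4 5 6)(7 8 9) both computations give 12, the value obtained in the discussion preceding the theorem.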


Note that the disjointness of the cycles in the theorem is imperative. 
For instance, (1 2) and (1 3), which are not disjoint, are each of order 2, 
but their product (1 2)(1 3) = (1 3 2) is of order 3.

Let’s consider Theorem 3.2.4 in the context of a card shuffle. Suppose 
that we shuffle a deck of 13 cards in such a way that the top card is put into 
the position of the 3rd card, the second in that of the 4th, ..., the ith into the 
i + 2 position, working mod 13. As a permutation, σ, of 1, 2, ..., 13, the
shuffle becomes

\[ \sigma = \begin{pmatrix}
   1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 & 13 \\
   3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 & 13 & 1 & 2
   \end{pmatrix}, \]

and σ is merely the 13-cycle (1 3 5 7 9 11 13 2 4 6 8 10 12),
so σ is of order 13. How many times must we repeat this shuffle to get the
cards back to their original order? The answer is merely the order of σ, that
is, 13. So it takes 13 repeats of the shuffle to get the cards back to their
original order.

Let’s give a twist to the shuffle above. Suppose that we shuffle the 
cards as follows. First take the top card and put it into the second-to-last 
place, and then follow it by the shuffle given above. How many repeats are 
now needed to get the cards back to their original order? The first operation 
is the shuffle given by the permutation τ = (1 12 11 10 9 8 7 6 5
4 3 2) followed by σ above. So we must compute στ and find its order.
But

στ = (1 3 5 7 9 11 13 2 4 6 8 10 12)(1 12 11 10 9 8 7 6 5 4 3 2)
   = (2 3 4 5 6 7 8 9 10 11 12 13),

so στ is of order 12. So it would take 12 repeats of the shuffle to get back
to the original order.
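
The two shuffles can be simulated directly. The following Python sketch is ours and is meant only as a check on the computation above. It assumes that a shuffle is stored as the second row of its symbol (position k goes to position perm[k - 1]), and the helper names compose and order are hypothetical.

```python
def compose(s, t):
    """Product st on {1, ..., n}: apply t first, then s."""
    return tuple(s[t[k] - 1] for k in range(len(t)))

def order(perm):
    """Number of repeats of perm needed to return to the original order."""
    identity = tuple(range(1, len(perm) + 1))
    power, k = perm, 1
    while power != identity:
        power, k = compose(perm, power), k + 1
    return k

n = 13
# sigma sends position i to position i + 2, working mod 13.
sigma = tuple((i + 1) % n + 1 for i in range(1, n + 1))
# tau = (1 12 11 ... 3 2): the top card goes to the second-to-last place,
# the cards in places 2 through 12 each move up one, and card 13 stays put.
tau = (12,) + tuple(range(1, 12)) + (13,)

sigma_tau = compose(sigma, tau)   # first tau, then sigma
print(order(sigma), order(sigma_tau))
print(sigma_tau)
```

The first shuffle has order 13 and the combined shuffle στ has order 12. The second row printed for στ shows that it fixes 1 and moves 2 → 3 → ... → 13 → 2, which is the 12-cycle (2 3 ... 13).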

Can you find a shuffle of the 13 cards that would require 42 repeats? Or 
20 repeats? What shuffle would require the greatest number of repeats, and 
what would this number be? 

We return to the general discussion. Consider the permutation
(1 2 3); we see that (1 2 3) = (1 3)(1 2). We can also see that
(1 2 3) = (2 3)(1 3). So two things are evident. First, we can write
(1 2 3) as the product of two transpositions, and in at least two distinct
ways. Given the k-cycle (i_1 i_2 ... i_k), then (i_1 i_2 ... i_k) =
(i_1 i_k)(i_1 i_{k-1}) ... (i_1 i_2), so every k-cycle is a product of k − 1
transpositions (if k > 1) and this can be done in several ways, so not in a
unique way. Because every permutation is the product of disjoint cycles and
every cycle is a product of transpositions we have


Theorem 3.2.5. Every permutation in S_n is the product of
transpositions.
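
The formula (i_1 i_2 ... i_k) = (i_1 i_k)(i_1 i_{k-1}) ... (i_1 i_2) can be checked by direct multiplication, remembering that our products are read from right to left. The Python sketch below is an illustration of ours, not part of the original text; cycles are tuples such as (1, 2, 3), permutations are stored as second rows, and the helper names cycle_to_perm, transpositions, and product are hypothetical.

```python
def compose(s, t):
    """Product st on {1, ..., n}: apply t first, then s."""
    return tuple(s[t[k] - 1] for k in range(len(t)))

def cycle_to_perm(cycle, n):
    """Second row, on {1, ..., n}, of the cycle (i_1 i_2 ... i_k)."""
    perm = list(range(1, n + 1))
    for a, b in zip(cycle, cycle[1:] + cycle[:1]):
        perm[a - 1] = b
    return tuple(perm)

def transpositions(cycle):
    """(i_1 ... i_k) as the list (i_1 i_k), (i_1 i_{k-1}), ..., (i_1 i_2)."""
    first = cycle[0]
    return [(first, x) for x in reversed(cycle[1:])]

def product(cycles, n):
    """Multiply cycles in the order written; the rightmost acts first."""
    result = tuple(range(1, n + 1))
    for c in cycles:
        result = compose(result, cycle_to_perm(c, n))
    return result

print(transpositions((1, 2, 3)))
print(product(transpositions((2, 5, 3, 7)), 7) == cycle_to_perm((2, 5, 3, 7), 7))
print(product([(2, 3), (1, 3)], 3) == cycle_to_perm((1, 2, 3), 3))
```

The first line printed is the list (1 3), (1 2), matching (1 2 3) = (1 3)(1 2); the last check confirms the second factorization (1 2 3) = (2 3)(1 3), so the decomposition into transpositions is indeed not unique.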


This theorem is really not surprising for it says, after all, nothing more 
or less than that any permutation can be effected by carrying out a series of 
interchanges of two objects at a time. 

We saw that there is a lack of uniqueness in representing a given per- 
mutation as a product of transpositions. But, as we shall see in Section 3, 
some aspects of this decomposition are indeed unique. 

As a final word of this section, we would like to point out the conve- 
nience of cycle notation. When we represent elements of a permutation 
group as products of disjoint cycles, many things become transparent—for 
example, the order of the permutation is visible at a glance. To illustrate this 
point, we now give a few examples of certain geometric groups, which are in 
fact permutation groups that have already appeared in Chapter 2 under dif- 
ferent guises. 


Examples 


1. Informally, a motion of a geometric figure is a permutation of its vertices 
that can be realized by a rigid motion in space. For example, there are eight 
motions of a square, whose vertices are numbered 1, 2, 3, 4 as below:

[Figure: a square with vertices 4 and 3 along the top and 1 and 2 along
the bottom.]

α = (13) is the reflection about the axis of symmetry joining vertices 2 and
4 in the original position,
β = (1234) is the counterclockwise rotation by 90°,
β^2 = (13)(24) is the counterclockwise rotation by 180°,
β^3 = (1432) is the counterclockwise rotation by 270°,
αβ = (12)(34) is the reflection in the vertical axis of symmetry,
αβ^2 = (24) is the reflection in the other diagonal axis,
αβ^3 = (14)(23) is the reflection in the horizontal axis, and, of course


α^2 = β^4 = (1) is the “motion” that leaves the vertices unchanged.


We also have the relation βα = αβ^3.


These motions, or symmetries of a square, form a subgroup of S_4 which is
called the octic group, or the dihedral group of order 8. This group (or,
strictly speaking, a group isomorphic to it) was introduced in Example 9 of
Section 2.1 without mention of permutations.
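
The table of motions in Example 1 can be verified by multiplying out the permutations, again from right to left. The sketch below is our own illustration in Python, not part of the original text. It assumes each motion is stored as the second row of its symbol on {1, 2, 3, 4}; the names compose and power are hypothetical.

```python
def compose(s, t):
    """Product st on {1, ..., n}: apply t first, then s."""
    return tuple(s[t[k] - 1] for k in range(len(t)))

identity = (1, 2, 3, 4)
alpha = (3, 2, 1, 4)   # the transposition (1 3)
beta = (2, 3, 4, 1)    # the 4-cycle (1 2 3 4)

def power(p, k):
    """p multiplied by itself k times (k = 0 gives the identity)."""
    result = identity
    for _ in range(k):
        result = compose(p, result)
    return result

rotations = [power(beta, i) for i in range(4)]         # e, β, β^2, β^3
reflections = [compose(alpha, r) for r in rotations]   # α, αβ, αβ^2, αβ^3
motions = set(rotations + reflections)

# The eight motions are distinct and closed under multiplication.
closed = all(compose(x, y) in motions for x in motions for y in motions)
print(len(motions), closed)

# The relation βα = αβ^3.
print(compose(beta, alpha) == compose(alpha, power(beta, 3)))
```

In this encoding αβ comes out as (2, 1, 4, 3), that is, (12)(34), and αβ^3 as (4, 3, 2, 1), that is, (14)(23), in agreement with the list above.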


2. There are only four symmetries of a non-square rectangle: 


the reflections in the two axes of symmetry, rotation by 180° and the identity. 
These motions can be identified with permutations (1), (14)(23), (12)(34), 
(13)(24), and form a subgroup of the group obtained in Example 1. This sub- 
group is often called Klein’s 4-group. 


3. We leave it to the reader to verify that the group of all motions of an 
equilateral triangle is the full symmetric group S_3.

[Figure: an equilateral triangle with vertex 3 at the top and vertices 1
and 2 at the base.]

4. The motions of a regular hexagon form the dihedral group of order 12,
generated by the permutations α = (15)(24), corresponding to a reflection
about one of the axes of symmetry, and β = (123456), corresponding to the
counterclockwise rotation by 60°.

[Figure: a regular hexagon with vertices numbered 1 through 6.]


In general, the dihedral group of order 2n, which was first introduced in Ex- 
ample 10 of Section 2.1, can be interpreted as the group of symmetries of a 
regular n-gon (a polygon with n edges of equal length). 


PROBLEMS 


Easier Problems 


1. Show that if σ, τ are two disjoint cycles, then στ = τσ.


2. Find the cycle decomposition and order.

(a) [two-row symbol not recoverable]

(b) [two-row symbol not recoverable]

(c) [first factor not recoverable]
\[ \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 2 & 3 & 1 & 5 & 6 & 7 & 4 \end{pmatrix}. \]

3. Express as the product of disjoint cycles and find the order.

(a) (1 2 3 5 7)(2 4 7 6).
(b) (1 2)(1 3)(1 4).
(c) (1 2 3 4 5)(1 2 3 4 6)(1 2 3 4 7).
(d) (1 2 3)(1 3 2).
(e) (1 2 3)(3 5 7 9)(1 2 3)^{-1}.
(f) (1 2 3 4 5)

4. Give a complete proof of Theorem 3.2.2. 

5. Show that a k-cycle has order k. 

6. Find a shuffle of a deck of 13 cards that requires 42 repeats to return the
cards to their original order.

7. Do Problem 6 for a shuffle requiring 20 repeats.

8. Express the permutations in Problem 3 as the product of transpositions.

9. Given the two transpositions (1 2) and (1 3), find a permutation σ
such that σ(1 2)σ^{-1} = (1 3).

10. Prove that there is no permutation σ such that σ(1 2)σ^{-1} = (1 2 3).

11. Prove that there is a permutation σ such that σ(1 2 3)σ^{-1} =
(4 5 6).

12. Prove that there is no permutation σ such that σ(1 2 3)σ^{-1} =
(1 2 4)(5 6 7).


Middle-Level Problems 


13. Prove that (1 2) cannot be written as the product of disjoint 3-cycles.

14. Prove that for any permutation $\sigma$, $\sigma\tau\sigma^{-1}$ is a transposition if $\tau$ is a
transposition.

15. Show that if $\tau$ is a $k$-cycle, then $\sigma\tau\sigma^{-1}$ is also a $k$-cycle, for any
permutation $\sigma$.


16. Let $\Phi$ be an automorphism of $S_3$. Show that there is an element $\sigma \in S_3$
such that $\Phi(\tau) = \sigma^{-1}\tau\sigma$ for every $\tau \in S_3$.

17. Let (1 2) and (1 2 3 $\cdots$ n) be in $S_n$. Show that any subgroup of $S_n$
that contains both of these must be all of $S_n$ (so these two permutations
generate $S_n$).

18. If $\tau_1$ and $\tau_2$ are two transpositions, show that $\tau_1\tau_2$ can be expressed as the
product of 3-cycles (not necessarily disjoint).

19. Prove that if $\tau_1$, $\tau_2$, and $\tau_3$ are transpositions, then $\tau_1\tau_2\tau_3 \ne e$, the identity
element of $S_n$.

20. If $\tau_1$, $\tau_2$ are distinct transpositions, show that $\tau_1\tau_2$ is of order 2 or 3.

21. If $\sigma$, $\tau$ are two permutations that disturb no common element and $\sigma\tau = e$,
prove that $\sigma = \tau = e$.

22. Find an algorithm for finding $\sigma\tau\sigma^{-1}$ for any permutations $\sigma$, $\tau$ of $S_n$.

23. Let $\sigma$, $\tau$ be two permutations such that they both have decompositions
into disjoint cycles of lengths $m_1, m_2, \ldots, m_k$. (We say that
they have similar decompositions into disjoint cycles.) Prove that for
some permutation $\rho$, $\tau = \rho\sigma\rho^{-1}$.

24. Find the conjugacy class in $S_n$ of (1 2 $\cdots$ n). What is the order of the
centralizer of (1 2 $\cdots$ n) in $S_n$?

25. Do Problem 24 for $\sigma = (1\ 2)(3\ 4)$.


3. ODD AND EVEN PERMUTATIONS


We noticed in Section 2 that although every permutation is the product of trans- 
positions, this decomposition is not unique. We did comment, however, that cer- 
tain aspects of this kind of decomposition are unique. We go into this now. 

Let’s consider the special case of $S_3$, for here we can see everything
explicitly. Let $f(x_1, x_2, x_3) = (x_1 - x_2)(x_1 - x_3)(x_2 - x_3)$ be an expression in
the three variables $x_1, x_2, x_3$. We let $S_3$ act on $f(x) = f(x_1, x_2, x_3)$ as follows.
If $\sigma \in S_3$, then

$$\sigma^*(f(x)) = (x_{\sigma(1)} - x_{\sigma(2)})(x_{\sigma(1)} - x_{\sigma(3)})(x_{\sigma(2)} - x_{\sigma(3)}).$$

We consider what $\sigma^*$ does to $f(x)$ for a few of the $\sigma$'s in $S_3$.
Consider $\sigma = (1\ 2)$. Then $\sigma(1) = 2$, $\sigma(2) = 1$, and $\sigma(3) = 3$, so that

$$\begin{aligned}
\sigma^*(f(x)) &= (x_{\sigma(1)} - x_{\sigma(2)})(x_{\sigma(1)} - x_{\sigma(3)})(x_{\sigma(2)} - x_{\sigma(3)}) \\
&= (x_2 - x_1)(x_2 - x_3)(x_1 - x_3) \\
&= -(x_1 - x_2)(x_1 - x_3)(x_2 - x_3) \\
&= -f(x).
\end{aligned}$$

So $\sigma^*$ coming from $\sigma = (1\ 2)$ changes the sign of $f(x)$. Let’s look at the
action of another element, $\tau = (1\ 2\ 3)$, of $S_3$ on $f(x)$. Then

$$\begin{aligned}
\tau^*(f(x)) &= (x_{\tau(1)} - x_{\tau(2)})(x_{\tau(1)} - x_{\tau(3)})(x_{\tau(2)} - x_{\tau(3)}) \\
&= (x_2 - x_3)(x_2 - x_1)(x_3 - x_1) \\
&= (x_1 - x_2)(x_1 - x_3)(x_2 - x_3) \\
&= f(x),
\end{aligned}$$

so $\tau^*$ coming from $\tau = (1\ 2\ 3)$ leaves $f(x)$ unchanged. What about the
other permutations in $S_3$; how do they affect $f(x)$? Of course, the identity
element $e$ induces a map $e^*$ on $f(x)$ which does not change $f(x)$ at all. What
does $\tau^2$, for $\tau$ above, do to $f(x)$? Since $\tau^* f(x) = f(x)$, we immediately see that

$$(\tau^2)^*(f(x)) = (x_{\tau^2(1)} - x_{\tau^2(2)})(x_{\tau^2(1)} - x_{\tau^2(3)})(x_{\tau^2(2)} - x_{\tau^2(3)}) = f(x). \quad \text{(Prove!)}$$

Now consider $\sigma\tau = (1\ 2)(1\ 2\ 3) = (2\ 3)$; since $\tau$ leaves $f(x)$ alone and $\sigma$
changes the sign of $f(x)$, $\sigma\tau$ must change the sign of $f(x)$. Similarly, (1 3) changes
the sign of $f(x)$. We have accounted for the action of every element of $S_3$ on $f(x)$.
Suppose that $\rho \in S_3$ is a product $\rho = \tau_1\tau_2\cdots\tau_k$ of transpositions
$\tau_1, \ldots, \tau_k$; then $\rho$ acting on $f(x)$ will change the sign of $f(x)$ $k$ times, since
each $\tau_i$ changes the sign of $f(x)$. So $\rho^*(f(x)) = (-1)^k f(x)$. If $\rho = \sigma_1\sigma_2\cdots\sigma_t$,
where $\sigma_1, \ldots, \sigma_t$ are transpositions, by the same reasoning, $\rho^*(f(x)) =
(-1)^t f(x)$. Therefore, $(-1)^k f(x) = (-1)^t f(x)$, whence $(-1)^t = (-1)^k$. This
tells us that $t$ and $k$ have the same parity; that is, if $t$ is odd, then $k$ must be
odd, and if $t$ is even, then $k$ must be even.

This suggests that although the decomposition of a given permutation $\sigma$
as a product of transpositions is not unique, the parity of the number of
transpositions in such a decomposition of $\sigma$ might be unique.

We strive for this goal now, suggesting to readers that they carry out 
the argument that we do for arbitrary n for the special case n = 4. 

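As we did above, define $f(x) = f(x_1, \ldots, x_n)$ to be

$$f(x) = \prod_{i<j} (x_i - x_j),$$

where in this product $i$ takes on all values from 1 to $n - 1$ inclusive, and $j$ all
those from 2 to $n$ inclusive, subject to $i < j$. If $\sigma \in S_n$, define $\sigma^*$ on $f(x)$ by

$$\sigma^*(f(x)) = \prod_{i<j} (x_{\sigma(i)} - x_{\sigma(j)}).$$

If $\sigma, \tau \in S_n$, then

$$\begin{aligned}
(\sigma\tau)^*(f(x)) &= \prod_{i<j} (x_{(\sigma\tau)(i)} - x_{(\sigma\tau)(j)}) = \sigma^*\Big(\prod_{i<j} (x_{\tau(i)} - x_{\tau(j)})\Big) \\
&= \sigma^*\Big(\tau^*\Big(\prod_{i<j} (x_i - x_j)\Big)\Big) = \sigma^*(\tau^*(f(x))) = (\sigma^*\tau^*)(f(x)).
\end{aligned}$$

So $(\sigma\tau)^* = \sigma^*\tau^*$ when applied to $f(x)$.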

What does a transposition $\tau$ do to $f(x)$? We claim that $\tau^*(f(x)) =
-f(x)$. To prove this, assuming that $\tau = (i\ j)$ where $i < j$, we count up the
number of $(x_u - x_v)$, with $u < v$, which get transformed into an $(x_a - x_b)$ with
$a > b$. This happens for $(x_u - x_j)$ if $i < u < j$, for $(x_i - x_v)$ if $i < v < j$, and
finally, for $(x_i - x_j)$. Each of these leads to a change of sign on $f(x)$ and since
there are $2(j - i - 1) + 1$ such, that is, an odd number of them, we get an
odd number of changes of sign on $f(x)$ when acted on by $\tau^*$. Thus
$\tau^*(f(x)) = -f(x)$. Therefore, our claim that $\tau^*(f(x)) = -f(x)$ for every
transposition $\tau$ is substantiated.
If $\sigma$ is any permutation in $S_n$ and $\sigma = \tau_1\tau_2\cdots\tau_k$, where $\tau_1, \tau_2, \ldots, \tau_k$
are transpositions, then $\sigma^* = (\tau_1\tau_2\cdots\tau_k)^* = \tau_1^*\tau_2^*\cdots\tau_k^*$ as acting on $f(x)$,
and since each $\tau_i^*(f(x)) = -f(x)$, we see that $\sigma^*(f(x)) = (-1)^k f(x)$.
Similarly, if $\sigma = \zeta_1\zeta_2\cdots\zeta_t$, where $\zeta_1, \zeta_2, \ldots, \zeta_t$ are transpositions, then
$\sigma^*(f(x)) = (-1)^t f(x)$. Comparing these two evaluations of $\sigma^*(f(x))$, we
conclude that $(-1)^k = (-1)^t$. So these two decompositions of $\sigma$ as the
product of transpositions are of the same parity. Thus any permutation is either
the product of an odd number of transpositions or the product of an even
number of transpositions, and no product of an even number of transpositions
can equal a product of an odd number of transpositions.
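
The argument can be spot-checked numerically. The sketch below (our own illustration, with hypothetical helper names) computes the sign $s$ in $\sigma^*(f(x)) = s f(x)$ by counting the factors $(x_{\sigma(i)} - x_{\sigma(j)})$, $i < j$, that come out reversed, and compares it with $(-1)^k$ for one particular decomposition of $\sigma$ into $k$ transpositions, obtained by sorting with swaps. It follows the earlier suggestion to try the case $n = 4$.

```python
from itertools import permutations

def sign_via_f(perm):
    """Sign s with sigma*(f) = s * f, where f is the product of (x_i - x_j), i < j.

    The factor (x_sigma(i) - x_sigma(j)) flips sign exactly when sigma(i) > sigma(j).
    """
    n = len(perm)
    sign = 1
    for i in range(n):
        for j in range(i + 1, n):
            if perm[i] > perm[j]:
                sign = -sign
    return sign

def transposition_count(perm):
    """Number of swaps used to sort perm; each swap is one transposition."""
    p = list(perm)
    swaps = 0
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]  # puts the value j into position j
            swaps += 1
    return swaps

# Compare the two computations on every element of S_4 (0-based symbols).
consistent = all(
    sign_via_f(p) == (-1) ** transposition_count(p)
    for p in permutations(range(4))
)
print(consistent)
```

The swap-sorting routine produces only one of many possible decompositions; the point of the theorem is that any other decomposition would give the same parity.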

This suggests the following 


Definition. The permutation $\sigma \in S_n$ is an odd permutation if $\sigma$ is the
product of an odd number of transpositions, and is an even permutation if $\sigma$
is the product of an even number of transpositions.

What we have proved above is 


Theorem 3.3.1. A permutation in S, is either an odd or an even per- 
mutation, but cannot be both. 


With Theorem 3.3.1 behind us we can deduce a number of its conse- 
quences. 

Let $A_n$ be the set of all even permutations; if $\sigma, \tau \in A_n$, then we
immediately have that $\sigma\tau \in A_n$. Since $A_n$ is thus a finite closed subset of the
(finite) group $S_n$, $A_n$ is a subgroup of $S_n$, by Lemma 2.3.2. $A_n$ is called the
alternating group of degree $n$.

We can show that $A_n$ is a subgroup of $S_n$ in another way. We already
saw that $A_n$ is closed under the product of $S_n$, so to know that $A_n$ is a
subgroup of $S_n$, we merely need show that $\sigma \in A_n$ implies that $\sigma^{-1} \in A_n$. For any
permutation $\sigma$ we claim that $\sigma$ and $\sigma^{-1}$ are of the same parity. Why? Well, if
$\sigma = \tau_1\tau_2\cdots\tau_k$, where the $\tau_i$ are transpositions, then

$$\sigma^{-1} = (\tau_1\tau_2\cdots\tau_k)^{-1} = \tau_k^{-1}\tau_{k-1}^{-1}\cdots\tau_2^{-1}\tau_1^{-1} = \tau_k\tau_{k-1}\cdots\tau_2\tau_1,$$

since $\tau_i^{-1} = \tau_i$. Therefore, we see that the parity of $\sigma$ and $\sigma^{-1}$ is $(-1)^k$, so
they are of equal parity. This certainly shows that $\sigma \in A_n$ forces $\sigma^{-1} \in A_n$,
whence $A_n$ is a subgroup of $S_n$.

But it shows a little more, namely that $A_n$ is a normal subgroup of $S_n$.
For suppose that $\sigma \in A_n$ and $\rho \in S_n$. What is the parity of $\rho^{-1}\sigma\rho$? By the
above, $\rho$ and $\rho^{-1}$ are of the same parity and $\sigma$ is an even permutation so $\rho^{-1}\sigma\rho$
is an even permutation, hence is in $A_n$. Thus $A_n$ is a normal subgroup of $S_n$.
We summarize what we have done in


Theorem 3.3.2. $A_n$, the alternating group of degree $n$, is a normal
subgroup of $S_n$.


We look at this in yet another way. From the very definitions involved 
we have the following simple rules for the product of permutations: 


1. The product of two even permutations is even.
2. The product of two odd permutations is even. 


3. The product of an even permutation by an odd one (or of an odd one 
by an even one) is odd. 


If $\sigma$ is an even permutation, let $\theta(\sigma) = 1$, and if $\sigma$ is an odd
permutation, let $\theta(\sigma) = -1$. The foregoing rules about products translate into
$\theta(\sigma\tau) = \theta(\sigma)\theta(\tau)$, so $\theta$ is a homomorphism of $S_n$ onto the group $E = \{1, -1\}$
of order 2 under multiplication. What is the kernel, $N$, of $\theta$? By the very
definition of $A_n$ we see that $N = A_n$. So by the First Homomorphism Theorem,
$E \simeq S_n/A_n$. Thus $2 = |E| = |S_n/A_n| = |S_n|/|A_n|$, if $n > 1$. This gives us that
$|A_n| = \tfrac{1}{2}|S_n| = \tfrac{1}{2}n!$.

Therefore,


Theorem 3.3.3. For $n > 1$, $A_n$ is a normal subgroup of $S_n$ of order $\tfrac{1}{2}n!$.


Corollary. For $n > 1$, $S_n$ contains $\tfrac{1}{2}n!$ even permutations and $\tfrac{1}{2}n!$ odd
permutations.
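
For small $n$ the count in the Corollary can be confirmed by brute force. The following sketch is only an illustration (the function names are ours); it decides parity by counting inversions, which agrees with the sign of $\sigma^*$ on $f(x)$.

```python
from itertools import permutations
from math import factorial

def is_even(perm):
    """A permutation is even exactly when it has an even number of inversions."""
    n = len(perm)
    inversions = sum(
        1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j]
    )
    return inversions % 2 == 0

def count_even(n):
    """Number of even permutations of n symbols, by direct enumeration."""
    return sum(1 for p in permutations(range(n)) if is_even(p))

for n in range(2, 7):
    # Columns: n, number of even permutations found, n!/2
    print(n, count_even(n), factorial(n) // 2)
```

The hypothesis $n > 1$ matters: for $n = 1$ the only permutation is the identity, which is even, so there is one even permutation and no odd one.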


A final few words about the proof of Theorem 3.3.1 before we close 
this section. Many different proofs of Theorem 3.3.1 are known. Quite 
frankly, we do not particularly like any of them. Some involve what might be 
called a “collection process,” where one tries to show that e cannot be writ- 
ten as the product of an odd number of transpositions by assuming that it is 
such a shortest product, and by the appropriate finagling with this product, 
shortening it to get a contradiction. Other proofs use other devices. The 
proof we gave exploits the gimmick of the function f(x), which, in some 
sense, is extraneous to the whole affair. However, the proof given is probably 
the most transparent of them all, which is why we used it. 

Finally, the group $A_n$, for $n \ge 5$, is an extremely interesting group. We
shall show in Chapter 6 that the only normal subgroups of $A_n$, for $n \ge 5$, are
$(e)$ and $A_n$ itself. A group with this property is called a simple group (not to
be confused with an easy group). The abelian finite simple groups are merely
the groups of prime order. The $A_n$ for $n \ge 5$ provide us with an infinite
family of nonabelian finite simple groups. There are other infinite families of finite
simple groups. In the last 20 years or so the heroic efforts of algebraists have
determined all finite simple groups. The determination of these simple groups
runs about 10,000 printed pages. Interestingly enough, any nonabelian finite
simple group must have even order.


PROBLEMS 
Easier Problems 


1. Find the parity of each permutation. 


@ (i 23456789 
BPN AG. Bo 7 Re Gy 


(b) (1 2 3 4 5 6)(7 8 9).
(c) (1 2 3 4 5 6)(1 2 3 4 5 7).
(d) (1 2)(1 2 3)(4 5)(5 6 8)(1 7 9).
2. If $\sigma$ is a $k$-cycle, show that $\sigma$ is an odd permutation if $k$ is even, and is an
even permutation if $k$ is odd.
3. Prove that $\sigma$ and $\tau^{-1}\sigma\tau$, for any $\sigma, \tau \in S_n$, are of the same parity.


4. If $m < n$, we can consider $S_m \subset S_n$ by viewing $\sigma \in S_m$ as acting on
$1, 2, \ldots, m, \ldots, n$ as it did on $1, 2, \ldots, m$, while $\sigma$ leaves every $j > m$ fixed.
Prove that the parity of a permutation in $S_m$, when viewed this way as an
element of $S_n$, does not change.


5. Suppose you are told that the permutation 


i -2. 3-4 5. 6. 7 3 9 
Ss £-2 7 8 9 6 


in $S_9$, where the images of 5 and 4 have been lost, is an even
permutation. What must the images of 5 and 4 be?


Middle-Level Problems 


6. If $n \ge 3$, show that every element in $A_n$ is a product of 3-cycles.
7. Show that every element in $A_n$ is a product of $n$-cycles.
8. Find a normal subgroup in $A_4$ of order 4.

Harder Problems (In fact, very hard)


9. If $n \ge 5$ and $(e) \ne N \subset A_n$ is a normal subgroup of $A_n$, show that $N$ must
contain a 3-cycle.

10. Using the result of Problem 9, show that if $n \ge 5$, the only normal
subgroups of $A_n$ are $(e)$ and $A_n$ itself. (Thus the groups $A_n$ for $n \ge 5$ give us
an infinite family of nonabelian finite simple groups.)


4 


RING THEORY 


1. DEFINITIONS AND EXAMPLES 


So far in our study of abstract algebra, we have been introduced to one kind 
of abstract system, which plays a central role in the algebra of today. That 
was the notion of a group. Because a group is an algebraic system with only
one operation, and because a group need not satisfy the rule ab = ba, it ran 
somewhat counter to our prior experience in algebra. We were used to sys- 
tems where you could both add and multiply elements and where the ele- 
ments did satisfy the commutative law of multiplication ab = ba. Further- 
more, these systems of our acquaintance usually came from sets of 
numbers—integers, rational, real, and for some, complex. 

The next algebraic object we shall consider is a ring. In many ways
this system will be more reminiscent of what we had previously known 
than were groups. For one thing rings will be endowed with addition and 
multiplication, and these will be subjected to many of the familiar rules 
we all know from arithmetic. On the other hand, rings need not come 
from our usual number systems, and, in fact, usually have little to do with
these familiar ones. Although many of the formal rules of arithmetic 
hold, many strange—or what may seem as strange—phenomena do take 
place. As we proceed and see examples of rings, we shall see some of 
these things occur. 

With this preamble over we are ready to begin. Naturally enough, the 
first thing we should do is to define that which we’ll be talking about. 


Definition. A nonempty set $R$ is said to be a ring if in $R$ there are two
operations $+$ and $\cdot$ such that:


(a) $a, b \in R$ implies that $a + b \in R$.

(b) $a + b = b + a$ for $a, b \in R$.

(c) $(a + b) + c = a + (b + c)$ for $a, b, c \in R$.

(d) There exists an element $0 \in R$ such that $a + 0 = a$ for every $a \in R$.

(e) Given $a \in R$, there exists a $b \in R$ such that $a + b = 0$. (We shall write
$b$ as $-a$.)


Note that so far all we have said is that $R$ is an abelian group under $+$. We
now spell out the rules for the multiplication in $R$.


(f) $a, b \in R$ implies that $a \cdot b \in R$.
(g) $a \cdot (b \cdot c) = (a \cdot b) \cdot c$ for $a, b, c \in R$.


This is all that we insist on as far as the multiplication by itself is concerned.
But the $+$ and $\cdot$ are not allowed to live in solitary splendor. We interweave
them by the two distributive laws


(h) $a \cdot (b + c) = a \cdot b + a \cdot c$ and
$(b + c) \cdot a = b \cdot a + c \cdot a$, for $a, b, c \in R$.


These axioms for a ring look familiar. They should be, for the concept 
of ring was introduced as a generalization of what happens in the integers. 
Because of Axiom (g), the associative law of multiplication, the rings we de- 
fined are usually called associative rings. Nonassociative rings do exist, and 
some of these play an important role in mathematics. But they shall not be 
our concern here. So whenever we use the word “ring” we shall always mean 
“associative ring.” 

Although Axioms (a) to (h) are familiar, there are certain things they 
do not say. We look at some of the familiar rules that are not insisted upon 
for a general ring. 

First, we do not postulate the existence of an element $1 \in R$ such that
$a \cdot 1 = 1 \cdot a = a$ for every $a \in R$. Many of the examples we shall encounter
will have such an element, and in that case we say that $R$ is a ring with unit.
In all fairness we should point out that many algebraists do demand that a
ring have a unit element. We do insist that $1 \ne 0$; that is, the ring consisting
of 0 alone is not a ring with unit.

Second, in our previous experience with things of this sort, whenever
$a \cdot b = 0$ we concluded that $a = 0$ or $b = 0$. This need not be true, in general,
in a ring. When it does hold, the ring is kind of nice and is given a special
name; it is called a domain.

Third, nothing is said in the axioms for a ring that will imply the
commutative law of multiplication $a \cdot b = b \cdot a$. There are noncommutative rings
where this law does not hold; we shall see some soon. Our main concern in 
this chapter will be with commutative rings, but for many of the early results 
the commutativity of the ring studied will not be assumed. 

As we mentioned above, some things make certain rings nicer than oth- 
ers, and so become worthy of having a special name. We quickly give a list of 
definitions for some of these nicer rings. 


Definition. A commutative ring $R$ is an integral domain if $a \cdot b = 0$ in
$R$ implies that $a = 0$ or $b = 0$.


It should be pointed out that some algebra books insist that an integral 
domain contain a unit element. In reading another book, the reader should 
check if this is the case there. The integers, Z, give us an obvious example of 
an integral domain. We shall see other, somewhat less obvious ones. 


Definition. A ring $R$ with unit is said to be a division ring if for every
$a \ne 0$ in $R$ there is an element $b \in R$ (usually written as $a^{-1}$) such that
$a \cdot a^{-1} = a^{-1} \cdot a = 1$.

The reason for calling such a ring a division ring is quite clear, for we 
can divide (at least keeping left and right sides in mind). Although noncom- 
mutative division rings exist with fair frequency and do play an important 
role in noncommutative algebra, they are fairly complicated and we shall 
give only one example of these. This division ring is the great classic one
introduced by Hamilton in 1843 and is known as the ring of quaternions. (See
Example 13 below.) 

Finally, we come to perhaps the nicest example of a class of rings, the field. 


Definition. A ring $R$ is said to be a field if $R$ is a commutative division
ring.


In other words, a field is a commutative ring in which we can divide 
freely by nonzero elements. Otherwise put, R is a field if the nonzero ele- 
ments of R form an abelian group under -, the product in R. 
For fields we do have some ready examples: the rational numbers, the
real numbers, the complex numbers. But we shall see many more, perhaps 
less familiar, examples. Chapter 5 will be devoted to the study of fields. 

We spend the rest of the time in this section looking at some examples 
of rings. We shall drop the $\cdot$ for the product and shall write $a \cdot b$ simply as $ab$.


Examples 


1. It is obvious which ring we should pick as our first example, namely Z, the 
ring of integers under the usual addition and multiplication of integers. Natu- 
rally enough, Z is an example of an integral domain. 


2. The second example is equally obvious as a choice. Let $\mathbb{Q}$ be the set of all
rational numbers. As we all know, $\mathbb{Q}$ satisfies all the rules needed for a field,
so $\mathbb{Q}$ is a field.


3. The real numbers, $\mathbb{R}$, also give us an example of a field.
4. The complex numbers, $\mathbb{C}$, form a field.


Note that $\mathbb{Q} \subset \mathbb{R} \subset \mathbb{C}$; we describe this by saying that $\mathbb{Q}$ is a subfield of
$\mathbb{R}$ (and of $\mathbb{C}$) and $\mathbb{R}$ is a subfield of $\mathbb{C}$.


5. Let $R = Z_6$, the integers mod 6, with the addition and the multiplication
defined by $[a] + [b] = [a + b]$ and $[a][b] = [ab]$.

Note that $[0]$ is the 0 required by our axioms for a ring, and $[1]$ is the unit
element of $R$. Note, however, that $Z_6$ is not an integral domain, for
$[2][3] = [6] = [0]$, yet $[2] \ne [0]$ and $[3] \ne [0]$. $R$ is a commutative ring with unit.


This example suggests the 


Definition. An element $a \ne 0$ in a ring $R$ is a zero-divisor in $R$ if $ab = 0$
for some $b \ne 0$ in $R$.


We should really call what we defined a left zero-divisor; however, 
since we shall mainly talk about commutative rings, we shall not need any 
left-right distinction for zero-divisors. 

Note that both $[2]$ and $[3]$ in $Z_6$ are zero-divisors. An integral domain
is, of course, a commutative ring without zero-divisors.
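
To make the definition concrete, here is a small sketch that lists the zero-divisors of $Z_n$ by exhaustive search. The function name `zero_divisors` is our own; residue classes are represented by the integers $0, 1, \ldots, n - 1$.

```python
def zero_divisors(n):
    """Nonzero classes a in Z_n with a*b = 0 (mod n) for some nonzero class b."""
    return [
        a
        for a in range(1, n)
        if any((a * b) % n == 0 for b in range(1, n))
    ]

print(zero_divisors(6))  # zero-divisors of Z_6
print(zero_divisors(5))  # zero-divisors of Z_5
```

In $Z_6$ the search finds $[4]$ as well as $[2]$ and $[3]$, since $[4][3] = [12] = [0]$.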


6. Let $R = Z_5$, the ring of integers mod 5. $R$ is, of course, a commutative ring
with unit. But it is more; in fact, it is a field. Its nonzero elements are $[1]$, $[2]$,
$[3]$, $[4]$ and we note that $[2][3] = [6] = [1]$, and $[1]$ and $[4]$ are their own
inverses. So every nonzero element in $Z_5$ has an inverse in $Z_5$.

We generalize this to any prime $p$.




7. Let $Z_p$ be the integers mod $p$, where $p$ is a prime. Again $Z_p$ is clearly a
commutative ring with 1. We claim that $Z_p$ is a field. To see this, note that if
$[a] \ne [0]$, then $p \nmid a$. Therefore, by Fermat’s Theorem (Corollary to Theorem
2.4.8), $a^{p-1} \equiv 1 \pmod{p}$. For the classes $[\cdot]$ this says that $[a^{p-1}] = [1]$. But $[a^{p-1}] =
[a]^{p-1}$, so $[a]^{p-1} = [1]$; therefore, $[a]^{p-2}$ is the required inverse for $[a]$ in $Z_p$,
hence $Z_p$ is a field.

Because $Z_p$ has only a finite number of elements, it is called a finite
field. Later we shall construct finite fields different from the $Z_p$'s.
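
The inverse $[a]^{p-2}$ from Example 7 is easy to compute. The sketch below uses Python's built-in three-argument `pow` for modular exponentiation; the function name `inverse_mod_prime` is our own, and the code assumes (without checking) that `p` is prime.

```python
def inverse_mod_prime(a, p):
    """Inverse of [a] in Z_p via Fermat's Theorem: [a]^(p-2). Assumes p is prime."""
    if a % p == 0:
        raise ZeroDivisionError("[0] has no multiplicative inverse")
    return pow(a, p - 2, p)

# In Z_5: [2] and [3] are inverses of each other, [4] is its own inverse.
for a in range(1, 5):
    print(a, inverse_mod_prime(a, 5))
```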


8. Let $\mathbb{Q}$ be the rational numbers; if $a \in \mathbb{Q}$, we can write $a = m/n$, where $m$
and $n$ are relatively prime integers. Call this the reduced form for $a$. Let $R$ be
the set of all $a \in \mathbb{Q}$ in whose reduced form the denominator is odd. Under
the usual addition and multiplication in $\mathbb{Q}$ the set $R$ forms a ring. It is an
integral domain with unit but is not a field, for $\tfrac{1}{2}$, the needed inverse of 2, is not
in $R$. Exactly which elements in $R$ do have their inverses in $R$?


9. Let $R$ be the set of all $a \in \mathbb{Q}$ in whose reduced form the denominator is
not divisible by a fixed prime $p$. As in (8), $R$ is a ring under the usual
addition and multiplication in $\mathbb{Q}$, is an integral domain but is not a field. What
elements of $R$ have their inverses in $R$?


Both Examples 8 and 9 are subrings of $\mathbb{Q}$ in the following sense.


Definition. If $R$ is a ring, then a subring of $R$ is a subset $S$ of $R$ which
is a ring if the operations $ab$ and $a + b$ are just the operations of $R$ applied to
the elements $a, b \in S$.


For $S$ to be a subring, it is necessary and sufficient that $S$ be nonempty
and that $ab$, $a \pm b \in S$ for all $a, b \in S$. (Prove!)

We give one further commutative example. This one comes from the 
calculus. 


10. Let $R$ be the set of all real-valued continuous functions on the closed unit
interval $[0, 1]$. For $f, g \in R$ and $x \in [0, 1]$ define $(f + g)(x) = f(x) + g(x)$,
and $(f \cdot g)(x) = f(x)g(x)$. From the results in the calculus, $f + g$ and $f \cdot g$ are
again continuous functions on $[0, 1]$. With these operations $R$ is a commutative
ring. It is not an integral domain. For instance, if $f(x) = -x + \tfrac{1}{2}$ for
$0 \le x \le \tfrac{1}{2}$ and $f(x) = 0$ for $\tfrac{1}{2} < x \le 1$, and if $g(x) = 0$ for $0 \le x \le \tfrac{1}{2}$ and
$g(x) = 2x - 1$ for $\tfrac{1}{2} < x \le 1$, then $f, g \in R$ and, as is easy to verify, $f \cdot g = 0$.
It does have a unit element, namely the function $e$ defined by $e(x) = 1$ for all
$x \in [0, 1]$. What elements of $R$ have their inverses in $R$?
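
The claim $f \cdot g = 0$ in Example 10 can be spot-checked on a grid of sample points. This is only a numerical illustration, not a proof; the grid of 101 points and the use of exact `Fraction` arithmetic are our own choices.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def f(x):
    """f(x) = -x + 1/2 on [0, 1/2], and 0 on (1/2, 1]."""
    return HALF - x if x <= HALF else Fraction(0)

def g(x):
    """g(x) = 0 on [0, 1/2], and 2x - 1 on (1/2, 1]."""
    return Fraction(0) if x <= HALF else 2 * x - 1

samples = [Fraction(k, 100) for k in range(101)]  # 0, 0.01, ..., 1
product_is_zero = all(f(x) * g(x) == 0 for x in samples)
print(product_is_zero)
print(f(Fraction(0)), g(Fraction(1)))  # neither function is identically zero
```

At every point one of the two factors vanishes, because $f$ is supported on $[0, \tfrac{1}{2}]$ and $g$ on $[\tfrac{1}{2}, 1]$, yet neither function is the zero function.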
We should now like to see some noncommutative examples. These are
not so easy to come by, although noncommutative rings exist in abundance, 
because we are not assuming any knowledge of linear algebra on the reader’s 
part. The easiest and most natural first source of such examples is the set of 
matrices over a field. So, in our first noncommutative example, we shall 
really create the 2 X 2 matrices with real entries. 


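11. Let $F$ be the field of real numbers and let $R$ be the set of all formal square
arrays

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

where $a, b, c, d$ are any real numbers. For such square arrays we define
addition in a natural way by defining

$$\begin{pmatrix} a_1 & b_1 \\ c_1 & d_1 \end{pmatrix} + \begin{pmatrix} a_2 & b_2 \\ c_2 & d_2 \end{pmatrix} = \begin{pmatrix} a_1 + a_2 & b_1 + b_2 \\ c_1 + c_2 & d_1 + d_2 \end{pmatrix}.$$

It is easy to see that $R$ forms an abelian group under this $+$ with
$\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$ acting as the zero element and
$\begin{pmatrix} -a & -b \\ -c & -d \end{pmatrix}$ the negative of
$\begin{pmatrix} a & b \\ c & d \end{pmatrix}$. To make of
$R$ a ring, we need a multiplication. We define one in what may seem a highly
unnatural way via

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} r & s \\ t & u \end{pmatrix} = \begin{pmatrix} ar + bt & as + bu \\ cr + dt & cs + du \end{pmatrix}.$$

It may be a little laborious, but one can check that with these operations $R$ is a
noncommutative ring with $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ acting as its multiplicative unit element.

Note that

$$\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$$

while

$$\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix},$$

so

$$\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \ne \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}.$$

Note that $\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$ and $\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$ are zero-divisors; in fact,

$$\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$$

is a nonzero element whose square is the 0 element of $R$. This $R$ is known as
the ring of all $2 \times 2$ matrices over $F$, the real field.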

For those unfamiliar with these matrices, and who see no sense in the 
product defined for them, let’s look at how we do compute the product. To 
get the top left entry in the product AB, we “multiply” the first row of A by 
the first column of B, where A, B € R. For the top right entry, it is the first 
row of A versus the second column of B. The bottom left entry comes from 
the second row of A versus the first column of B, and finally, the bottom 
right entry is the second row of A versus the second column of B.

We illustrate with an example: Let 


a. “a . = 
A=(3 : and a=(8 5) 


Then the first row of A is 1,5 and the first column of B is §, 7; we “multiply” 
these via 1-4 + $+ m= m/2 + 3, and so on. So we see that 


In the problems we shall have many matrix multiplications, so that the 
reader can acquire some familiarity with this strange but important example. 
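
For readers who like to experiment, the row-by-column rule of Example 11 can be written out directly. The sketch below stores a $2 \times 2$ matrix as a pair of rows; the names `matmul`, `E11`, and `E21` are our own labels, not notation from the text.

```python
def matmul(A, B):
    """Product of 2 x 2 matrices, each given as ((a, b), (c, d))."""
    (a, b), (c, d) = A
    (r, s), (t, u) = B
    return ((a * r + b * t, a * s + b * u),
            (c * r + d * t, c * s + d * u))

E11 = ((1, 0), (0, 0))
E21 = ((0, 0), (1, 0))

print(matmul(E11, E21))  # the zero matrix
print(matmul(E21, E11))  # equals E21, so the two products differ
print(matmul(E21, E21))  # E21 is nonzero, yet its square is zero
```

Each entry of the result is a row of the first factor "multiplied" against a column of the second, exactly as described above.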


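12. Let $R$ be any ring and let

$$S = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \,\middle|\, a, b, c, d \in R \right\}$$

with $+$ and $\cdot$ as defined in Example 11. One can verify that $S$ is a ring, also,
under these operations. It is called the ring of $2 \times 2$ matrices over $R$.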


Our final example is one of the great classical examples, the real qua- 
ternions, introduced by Hamilton (as a noncommutative parallel to the com- 
plex numbers). 


13. The quaternions. Let $F$ be the field of real numbers and consider the set
of all formal symbols $\alpha_0 + \alpha_1 i + \alpha_2 j + \alpha_3 k$, where $\alpha_0, \alpha_1, \alpha_2, \alpha_3 \in F$.
Equality and addition of these symbols are easy, via the obvious route

$$\alpha_0 + \alpha_1 i + \alpha_2 j + \alpha_3 k = \beta_0 + \beta_1 i + \beta_2 j + \beta_3 k$$

if and only if $\alpha_0 = \beta_0$, $\alpha_1 = \beta_1$, $\alpha_2 = \beta_2$ and $\alpha_3 = \beta_3$, and

$$(\alpha_0 + \alpha_1 i + \alpha_2 j + \alpha_3 k) + (\beta_0 + \beta_1 i + \beta_2 j + \beta_3 k)
= (\alpha_0 + \beta_0) + (\alpha_1 + \beta_1) i + (\alpha_2 + \beta_2) j + (\alpha_3 + \beta_3) k.$$


We now come to the tricky part, the multiplication. When Hamilton discovered
it on October 16, 1843, he cut the basic rules of this product out with his
penknife on Brougham Bridge in Dublin. The product is based on $i^2 = j^2 =
k^2 = -1$, $ij = k$, $jk = i$, $ki = j$ and $ji = -k$, $kj = -i$, $ik = -j$. If we go around
the circle clockwise

[Figure: a circle with i, j, k arranged clockwise.]

the product of any two successive ones is the next one, and going around
counterclockwise we get the negatives.


We can write out the product now of any two quaternions, according to 
the rules above, declaring by definition that 


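$$(\alpha_0 + \alpha_1 i + \alpha_2 j + \alpha_3 k)(\beta_0 + \beta_1 i + \beta_2 j + \beta_3 k) = \gamma_0 + \gamma_1 i + \gamma_2 j + \gamma_3 k,$$

where

$$\begin{aligned}
\gamma_0 &= \alpha_0\beta_0 - \alpha_1\beta_1 - \alpha_2\beta_2 - \alpha_3\beta_3 \\
\gamma_1 &= \alpha_0\beta_1 + \alpha_1\beta_0 + \alpha_2\beta_3 - \alpha_3\beta_2 \\
\gamma_2 &= \alpha_0\beta_2 - \alpha_1\beta_3 + \alpha_2\beta_0 + \alpha_3\beta_1 \\
\gamma_3 &= \alpha_0\beta_3 + \alpha_1\beta_2 - \alpha_2\beta_1 + \alpha_3\beta_0
\end{aligned} \tag{I}$$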


It looks horrendous, doesn’t it? But it’s not as bad as all that. We are 
multiplying out formally using the distributive laws and using the product 
rules for the i, j, k above. 

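If some $\alpha_i$ is 0 in $x = \alpha_0 + \alpha_1 i + \alpha_2 j + \alpha_3 k$, we shall omit it in
expressing $x$; thus $0 + 0i + 0j + 0k$ will be written simply as 0, $1 + 0i + 0j + 0k$ as 1,
$0 + 3i + 4j + 0k$ as $3i + 4j$, and so on.

A calculation reveals that

$$(\alpha_0 + \alpha_1 i + \alpha_2 j + \alpha_3 k)(\alpha_0 - \alpha_1 i - \alpha_2 j - \alpha_3 k) = \alpha_0^2 + \alpha_1^2 + \alpha_2^2 + \alpha_3^2. \tag{II}$$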
This has a very important consequence; for suppose that $x = \alpha_0 + \alpha_1 i +
\alpha_2 j + \alpha_3 k \ne 0$ (so some $\alpha_i \ne 0$). Then, since the $\alpha$'s are real, $\beta = \alpha_0^2 + \alpha_1^2 +
\alpha_2^2 + \alpha_3^2 \ne 0$. Then from (II) we easily get

$$(\alpha_0 + \alpha_1 i + \alpha_2 j + \alpha_3 k)\left(\frac{\alpha_0}{\beta} - \frac{\alpha_1}{\beta} i - \frac{\alpha_2}{\beta} j - \frac{\alpha_3}{\beta} k\right) = 1.$$

So, if $x \ne 0$, then $x$ has an inverse in the quaternions. Thus the quaternions
form a noncommutative division ring.
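
The product rule (I) and the inverse obtained from (II) translate directly into a few lines of code. In this sketch a quaternion $\alpha_0 + \alpha_1 i + \alpha_2 j + \alpha_3 k$ is stored as the tuple $(\alpha_0, \alpha_1, \alpha_2, \alpha_3)$; the names `qmul` and `qinv` are our own, and rational coefficients are used so that the arithmetic is exact.

```python
from fractions import Fraction

def qmul(x, y):
    """Quaternion product according to formula (I)."""
    a0, a1, a2, a3 = x
    b0, b1, b2, b3 = y
    return (a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
            a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
            a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
            a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0)

def qinv(x):
    """Inverse of a nonzero quaternion: the conjugate divided by beta, as in (II)."""
    a0, a1, a2, a3 = (Fraction(c) for c in x)
    beta = a0 * a0 + a1 * a1 + a2 * a2 + a3 * a3
    if beta == 0:
        raise ZeroDivisionError("0 has no inverse")
    return (a0 / beta, -a1 / beta, -a2 / beta, -a3 / beta)

I, J, K = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
print(qmul(I, J), qmul(J, I))   # ij = k while ji = -k
x = (1, 2, 3, 4)
print(qmul(x, qinv(x)))         # the quaternion 1
```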


Although, as we mentioned earlier, there is no lack of noncommutative 
division rings, the quaternions above (or some piece of them) are often the 
only noncommutative division rings that even many professional mathemati- 
cians have ever seen. 

We shall have many problems—some easy and some quite a bit 
harder—about the two examples: the 2 X 2 matrices and the quaternions. 
This way the reader will be able to acquire some skill with playing with 
noncommutative rings. 

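One final comment in this section: If $\gamma_0, \gamma_1, \gamma_2, \gamma_3$ are as in (I), then

$$(\alpha_0^2 + \alpha_1^2 + \alpha_2^2 + \alpha_3^2)(\beta_0^2 + \beta_1^2 + \beta_2^2 + \beta_3^2) = \gamma_0^2 + \gamma_1^2 + \gamma_2^2 + \gamma_3^2. \tag{III}$$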


This is known as Lagrange’s Identity; it expresses the product of two sums of 
four squares again as a sum of four squares. Its verification will be one of the 
exercises. 
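
Before attempting the verification by hand, one can test Lagrange's Identity on a range of integer inputs. The sketch below is a numerical check only, under our own choice of test range (all coefficients in $\{-1, 0, 1\}$); `qmul` implements formula (I) and `norm_sq` is a hypothetical helper for the sum of four squares.

```python
from itertools import product

def qmul(x, y):
    """Components (gamma_0, ..., gamma_3) given by formula (I)."""
    a0, a1, a2, a3 = x
    b0, b1, b2, b3 = y
    return (a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
            a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
            a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
            a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0)

def norm_sq(x):
    """Sum of the squares of the four components."""
    return sum(c * c for c in x)

# Check (III) for every pair of 4-tuples with entries in {-1, 0, 1}.
identity_holds = all(
    norm_sq(x) * norm_sq(y) == norm_sq(qmul(x, y))
    for x in product(range(-1, 2), repeat=4)
    for y in product(range(-1, 2), repeat=4)
)
print(identity_holds)
```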


PROBLEMS 


Easier Problems 


*1. Find all the elements in Z,, that are invertible (1.e., have a multiplicative 
inverse) in Z5,. 

2. Show that any field is an integral domain.

3. Show that $Z_n$ is a field if and only if $n$ is a prime.

4. Verify that Example 8 is a ring. Find all its invertible elements.

5. Do Problem 4 for Example 9.

6. In Example 11, the $2 \times 2$ matrices over the reals, check the associative
law of multiplication.

7. Work out the following:


Eo 2\rS. 
(a) ; It ') 


a b\f1 0O 1 O\f/a b 
(d) t At ; et Oy “} 
a b a b\f1 O\_ /1 O\fa b 
8. Find all matrices : 2 such that (‘ AF = (( 4 b ’) 


9. Find all $2 \times 2$ matrices $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ that commute with all $2 \times 2$ matrices.


10. Let $R$ be any ring with unit, $S$ the ring of $2 \times 2$ matrices over $R$. (See
Example 12.)
(a) Check the associative law of multiplication in $S$. (Remember: $R$ need
not be commutative.)
(b) Show that $\left\{ \begin{pmatrix} a & b \\ 0 & c \end{pmatrix} \,\middle|\, a, b, c \in R \right\}$ is a subring of $S$.
(c) Show that $\begin{pmatrix} a & b \\ 0 & c \end{pmatrix}$ has an inverse in $S$ if and only if $a$ and $c$ have
inverses in $R$. In that case write down $\begin{pmatrix} a & b \\ 0 & c \end{pmatrix}^{-1}$ explicitly.


11. Let $F: \mathbb{C} \to \mathbb{C}$ be defined by $F(a + bi) = a - bi$. Show that:
(a) $F(xy) = F(x)F(y)$ for $x, y \in \mathbb{C}$.
(b) $F(x\bar{x}) = |x|^2$.
(c) Using Parts (a) and (b), show that
$(a^2 + b^2)(c^2 + d^2) = (ac - bd)^2 + (ad + bc)^2$.
[Note: $F(x)$ is merely $\bar{x}$.]
12. Verify the identity in Part (c) of Problem 11 directly.
13. Find the following products of quaternions.
(a) $(i + j)(i - j)$.
(b) $(1 - i + 2j - 2k)(1 + 2i - 4j + 6k)$.
(c) $(2i - 3j + 4k)^2$.
(d) $i(\alpha_0 + \alpha_1 i + \alpha_2 j + \alpha_3 k) - (\alpha_0 + \alpha_1 i + \alpha_2 j + \alpha_3 k)i$.
14. Show that the only quaternions commuting with $i$ are of the form $\alpha + \beta i$.
15. Find the quaternions that commute with both $i$ and $j$.
16. Verify that

$$(\alpha_0 + \alpha_1 i + \alpha_2 j + \alpha_3 k)(\alpha_0 - \alpha_1 i - \alpha_2 j - \alpha_3 k) = \alpha_0^2 + \alpha_1^2 + \alpha_2^2 + \alpha_3^2.$$

17. Verify Lagrange’s Identity by a direct calculation.


Middle-Level Problems 


18. In the quaternions, define

$$|\alpha_0 + \alpha_1 i + \alpha_2 j + \alpha_3 k| = \sqrt{\alpha_0^2 + \alpha_1^2 + \alpha_2^2 + \alpha_3^2}.$$

Show that $|xy| = |x||y|$ for any two quaternions $x$ and $y$.
19. Show that there is an infinite number of solutions to $x^2 = -1$ in the
quaternions.


20. In the quaternions, consider the following set $G$ having eight elements:
$G = \{\pm 1, \pm i, \pm j, \pm k\}$.
(a) Prove that $G$ is a group (under multiplication).
(b) List all subgroups of $G$.
(c) What is the center of $G$?
(d) Show that G is a nonabelian group all of whose subgroups are 
normal. 


21. Show that a division ring is a domain. 
22. Give an example, in the quaternions, of a noncommutative domain that 
is not a division ring. 


23. Define the map $*$ in the quaternions by

$$(\alpha_0 + \alpha_1 i + \alpha_2 j + \alpha_3 k)^* = \alpha_0 - \alpha_1 i - \alpha_2 j - \alpha_3 k.$$

Show that:
(a) $x^{**} = (x^*)^* = x$.
(b) $(x + y)^* = x^* + y^*$.
(c) $xx^* = x^*x$ is real and nonnegative.
(d) $(xy)^* = y^*x^*$.
[Note the reversal of order in Part (d).]
24. Using $*$, define $|x| = \sqrt{xx^*}$. Show that $|xy| = |x||y|$ for any two
quaternions $x$ and $y$, by using Parts (c) and (d) of Problem 23.
25. Use the result of Problem 24 to prove Lagrange’s Identity.


In Problems 26 to 30, let $R$ be the $2 \times 2$ matrices over the reals.


26. If $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in R$, show that $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is invertible in $R$ if and only if $ad - bc \ne 0$.
In that case find $\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1}$.

27. Define $\det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc$. For $x, y \in R$ show that $\det(xy) =
(\det x)(\det y)$.

28. Show that $\{x \in R \mid \det x \ne 0\}$ forms a group, $G$, under matrix multiplication
and that $N = \{x \in R \mid \det x = 1\}$ is a normal subgroup of $G$.

29. If $x \in R$ is a zero-divisor, show that $\det x = 0$, and, conversely, if $x \ne 0$ is
such that $\det x = 0$, then $x$ is a zero-divisor in $R$.

30. In $R$, show that $\left\{ \begin{pmatrix} a & b \\ -b & a \end{pmatrix} \,\middle|\, a, b \text{ real} \right\}$ is a field.


Harder Problems 


31. Let R be the ring of all $2 \times 2$ matrices over $Z_p$, p a prime. Show that if
$\det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc \neq 0$, then
$\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is invertible in R.

32. Let R be as in Problem 31. Show that for $x, y \in R$, det(xy) = det(x) det(y).

33. Let G be the set of elements x in the ring R of Problem 31 such that
$\det(x) \neq 0$.

(a) Prove that G is a group.

(b) Find the order of G. (Quite hard)

(c) Find the center of G.

(d) Find a p-Sylow subgroup of G.

34. Let T be the group of matrices A with entries in the field $Z_2$ such that det A
is not equal to 0. Prove that T is isomorphic to $S_3$, the symmetric group
of degree 3.

35. For R as in Example 10, show that $S = \{f \in R \mid f \text{ is differentiable on } (0, 1)\}$
is a subring of R which is not an integral domain.


If F is a field, let H(F) be the ring of quaternions over F, that is, the
set of all $a_0 + a_1 i + a_2 j + a_3 k$, where $a_0, a_1, a_2, a_3 \in F$ and where
equality, addition, and multiplication are defined as for the real quaternions.

36. If F = C, the complex numbers, show that H(C) is not a division ring.

37. In H(C), find an element $x \neq 0$ such that $x^2 = 0$.

38. Show that H(F) is a division ring if and only if $a_0^2 + a_1^2 + a_2^2 + a_3^2 = 0$
for $a_0, a_1, a_2, a_3$ in F forces $a_0 = a_1 = a_2 = a_3 = 0$.

39. If Q is the field of rational numbers, show that H(Q) is a division ring. 

40. Prove that a finite domain is a division ring. 


41. Use Problem 40 to show that $Z_p$ is a field if p is a prime.


2. SOME SIMPLE RESULTS 


Now that we have seen some examples of rings and have had some experi- 
ence playing around with them, it would seem wise to develop some compu- 
tational rules. These will allow us to avoid annoying trivialities that could 
beset a calculation we might be making. 

The results we shall prove in this section are not very surprising, not 
too interesting, and certainly not at all exciting. Neither was learning the al- 
phabet, but it was something we had to do before going on to bigger and bet- 
ter things. The same holds for the results we are about to prove. 

Since a ring R is at least an abelian group under +, there are
certain things we know from our group theory background, for instance,
$-(-a) = a$, $-(a + b) = (-a) + (-b)$; if $a + b = a + c$, then $b = c$, and
so on.

We begin with 


Lemma 4.2.1. Let R be any ring and let $a, b \in R$. Then


(a) $a0 = 0a = 0$.

(b) $a(-b) = (-a)b = -(ab)$.

(c) $(-a)(-b) = ab$.

(d) If $1 \in R$, then $(-1)a = -a$.


Proof. We do these in turn. 

(a) Since $0 = 0 + 0$, $a0 = a(0 + 0) = a0 + a0$, hence $a0 = 0$. We have
used the left distributive law in this proof. The right distributive law gives
$0a = 0$.

(b) $ab + a(-b) = a(b + (-b)) = a0 = 0$ from Part (a). Therefore,
$a(-b) = -(ab)$. Similarly, $(-a)b = -(ab)$.

(c) By Part (b), $(-a)(-b) = -((-a)b) = -(-(ab)) = ab$, since we are
in an abelian group.

(d) If $1 \in R$, then $(-1)a + a = (-1)a + (1)a = (-1 + 1)a = 0a = 0$. So
$(-1)a = -a$ by the definition of $-a$. □


Another computational result. 


Lemma 4.2.2. In any ring R, $(a + b)^2 = a^2 + b^2 + ab + ba$ for $a, b \in R$.


Proof. This is clearly the analog of $(\alpha + \beta)^2 = \alpha^2 + 2\alpha\beta + \beta^2$ in the
integers, say, but keeping in mind that R may be noncommutative. So, to it. By
the right distributive law $(a + b)^2 = (a + b)(a + b) = (a + b)a + (a + b)b =
a^2 + ba + ab + b^2$, exactly what was claimed. □
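
As a small numerical illustration of Lemma 4.2.2 (a sketch, not part of the proof), the following Python snippet checks the identity in the noncommutative ring of 2 x 2 integer matrices. The helper names `mat_add` and `mat_mul` and the particular matrices `a`, `b` are choices made here for illustration only.

```python
# 2 x 2 integer matrices stored as nested tuples; this ring is noncommutative.
def mat_add(x, y):
    return tuple(tuple(x[i][j] + y[i][j] for j in range(2)) for i in range(2))

def mat_mul(x, y):
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

# Two sample matrices chosen so that ab != ba.
a = ((0, 1), (0, 0))
b = ((0, 0), (1, 0))

s = mat_add(a, b)
lhs = mat_mul(s, s)  # (a + b)^2

# a^2 + b^2 + ab + ba, as in Lemma 4.2.2
rhs = mat_add(mat_add(mat_mul(a, a), mat_mul(b, b)),
              mat_add(mat_mul(a, b), mat_mul(b, a)))

# a^2 + b^2 + 2ab, the formula that is valid only when ab = ba
naive = mat_add(mat_add(mat_mul(a, a), mat_mul(b, b)),
                mat_add(mat_mul(a, b), mat_mul(a, b)))

print(lhs == rhs)    # the lemma's identity holds for this pair
print(lhs == naive)  # the commutative formula fails for this pair
```

For this pair the two cross terms ab and ba are different matrices, which is why the familiar 2ab cannot replace ab + ba.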


Can you see the noncommutative version of the binomial theorem? Try
it for $(a + b)^3$.

One curiosity follows from the two distributive laws when R has a unit 
element. The commutative law of addition follows from the rest. 


Lemma 4.2.3. If R is a system with 1 satisfying all the axioms of a
ring, except possibly $a + b = b + a$ for $a, b \in R$, then R is a ring.


Proof. We must show that $a + b = b + a$ for $a, b \in R$. By the right
distributive law $(a + b)(1 + 1) = (a + b)1 + (a + b)1 = a + b + a + b$. On the
other hand, by the left distributive law $(a + b)(1 + 1) = a(1 + 1) + b(1 + 1)
= a + a + b + b$. But then $a + b + a + b = a + a + b + b$; since we are in a
group under +, we can cancel a on the left and b on the right to obtain
$b + a = a + b$, as required. R is therefore a ring. □


We close this brief section with a result that is a little nicer. We say that
a ring R is a Boolean ring [after the English mathematician George Boole
(1815-1864)] if $x^2 = x$ for every $x \in R$.


We prove a nice result on Boolean rings. 


Lemma 4.2.4. A Boolean ring is commutative. 


Proof. Let $x, y \in R$, a Boolean ring. Thus $x^2 = x$, $y^2 = y$, $(x + y)^2 =
x + y$. But $(x + y)^2 = x^2 + xy + yx + y^2 = x + xy + yx + y$, by Lemma 4.2.2,
so we have $(x + y) = (x + y)^2 = x + xy + yx + y$, from which we have
$xy + yx = 0$. Thus $0 = x(xy + yx) = x^2 y + xyx = xy + xyx$, while $0 =
(xy + yx)x = xyx + yx^2 = xyx + yx$. This gives us $xy + xyx = xyx + yx$, and
so $xy = yx$. Therefore, R is commutative. □
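
A standard example of a Boolean ring, not discussed in the text at this point, is the set of all subsets of a fixed set, with symmetric difference as the sum and intersection as the product. The sketch below assumes that model on the three-element set {1, 2, 3}; the names `add`, `mul`, and `subsets` are illustrative. It checks $x^2 = x$, the intermediate identity $xy + yx = 0$ from the proof, and commutativity.

```python
from itertools import combinations

base = (1, 2, 3)
# All 8 subsets of {1, 2, 3}, as frozensets.
subsets = [frozenset(c) for r in range(len(base) + 1)
           for c in combinations(base, r)]

def add(x, y):
    return x ^ y  # symmetric difference plays the role of +

def mul(x, y):
    return x & y  # intersection plays the role of the product

zero = frozenset()

is_boolean = all(mul(x, x) == x for x in subsets)
cross_terms_vanish = all(add(mul(x, y), mul(y, x)) == zero
                         for x in subsets for y in subsets)
is_commutative = all(mul(x, y) == mul(y, x)
                     for x in subsets for y in subsets)

print(is_boolean, cross_terms_vanish, is_commutative)
```

Commutativity is evident in this particular model; the point of Lemma 4.2.4 is that it follows from $x^2 = x$ alone.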
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1. Let R be a ring: since R is an abelian group under +, na has meaning for 
us for $n \in Z$, $a \in R$. Show that (na)(mb) = (nm)(ab) if n, m are integers
and $a, b \in R$.

2. If R is an integral domain and ab = ac for $a \neq 0$, $b, c \in R$, show that b = c.

3. If R is a finite integral domain, show that R is a field.

4. If R is a ring and $e \in R$ is such that $e^2 = e$, show that $(xe - exe)^2 =
(ex - exe)^2 = 0$ for every $x \in R$.

5. Let R be a ring in which $x^3 = x$ for every $x \in R$. Prove that R is
commutative.

6. If $a^2 = 0$ in R, show that ax + xa commutes with a.

7. Let R be a ring in which $x^4 = x$ for every $x \in R$. Prove that R is
commutative.

8. If F is a finite field, show that:

(a) There exists a prime p such that pa = 0 for all $a \in F$.
(b) If F has q elements, then $q = p^n$ for some integer n. (Hint: Cauchy's
Theorem)

9. Let p be an odd prime and let $1 + \frac{1}{2} + \cdots + \frac{1}{p-1} = \frac{a}{b}$, where a, b
are integers. Show that $p \mid a$. (Hint: As a runs through $U_p$, so does $a^{-1}$.)

10. If p is a prime and p > 3, show that if $1 + \frac{1}{2} + \cdots + \frac{1}{p-1} = \frac{a}{b}$, where
a, b are integers, then $p^2 \mid a$. (Hint: Consider $1/a^2$ as a runs through $U_p$.)


3. IDEALS, HOMOMORPHISMS, AND QUOTIENT RINGS 


In studying groups, it turned out that homomorphisms, and their kernels— 
the normal subgroups—played a central role. There is no reason to expect 
that the same thing should not be true for rings. As a matter of fact, the 
analogs, in the setting of rings, of homomorphism and normal subgroup do 
play a key role. 

With the background we have acquired about such things in group the- 
ory, the parallel development for rings should be easy and quick. And it will 
be! Without any further fuss we make the 


Definition. The mapping $\varphi: R \to R'$ of the ring R into the ring R' is a
homomorphism if


(a) $\varphi(a + b) = \varphi(a) + \varphi(b)$ and
(b) $\varphi(ab) = \varphi(a)\varphi(b)$ for all $a, b \in R$.
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Since a ring has two operations, it is only natural and just that we de- 
mand that both these operations be preserved under what we would call a 
ring homomorphism. Furthermore, Property (a) in the Definition tells us that
$\varphi$ is a homomorphism of R viewed merely as an abelian group under + into
R' (also viewed as a group under its addition). So we can call on, and expect,
certain results from this fact alone.

Just as we saw in Chapter 2, Section 5 for groups, the image of R under 
a homomorphism from R to R’, is a subring of R’, as defined in Chapter 4, 
Section 1 (Prove!). 

Let $\varphi: R \to R'$ be a ring homomorphism and let Ker $\varphi$ =
$\{x \in R \mid \varphi(x) = 0\}$, the 0 being that of R'. What properties does Ker $\varphi$ enjoy?
Clearly, from group theory Ker $\varphi$ is an additive subgroup of R. But much
more is true. If $k \in$ Ker $\varphi$ and $r \in R$, then $\varphi(k) = 0$, so $\varphi(kr) = \varphi(k)\varphi(r) =
0\varphi(r) = 0$, and similarly, $\varphi(rk) = 0$. So Ker $\varphi$ swallows up multiplication
from the left and the right by arbitrary ring elements.

This property of Ker $\varphi$ is now abstracted to define the important analog
in ring theory of the notion of normal subgroup in group theory.


Definition. Let R be a ring. A nonempty subset I of R is called an
ideal of R if:


(a) I is an additive subgroup of R.
(b) Given $r \in R$, $a \in I$, then $ra \in I$ and $ar \in I$.


We shall soon see some examples of homomorphisms and ideals. But
first we note that Part (b) in the definition of ideal really has a left and a right
part. We could split it and define a set L of R to be a left ideal of R if L is an
additive subgroup of R and given $r \in R$, $a \in L$, then $ra \in L$. So we require
only left swallowing-up for a left ideal. We can similarly define right ideals.
An ideal as we defined it is both a left and a right ideal of R. By all rights we
should then call what we called an ideal a two-sided ideal of R. Indeed, in
working in noncommutative ring theory one uses this name; here, by "ideal"
we shall always mean a two-sided ideal. Except for some of the problems, we
shall not use the notion of one-sided ideals in this chapter.

Before going on, we record what was done above for Ker $\varphi$ as


Lemma 4.3.1. If $\varphi: R \to R'$ is a homomorphism, then Ker $\varphi$ is an
ideal of R.


We shall soon see that every ideal can be made the kernel of a homo- 
morphism. Shades of what happens for normal subgroups of groups! 
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Finally, let K be an ideal of R. Since K is an additive subgroup of R, the 
quotient group R/K exists; it is merely the set of all cosets a + K as a runs 
over R. But R is not just a group; it is, after all, a ring. Nor is K merely an
additive subgroup of R; it is more than that, namely an ideal of R. We should
be able to put all this together to make of R/K a ring.

How should we define a product in R/K in a natural way? What do we 
want to declare (a + K)(b + K) to be? The reasonable thing is to define 
(a + K)(b + K) = ab + K, which we do. As always, the first thing that 
comes up is to show that this product is well-defined. Is it? We must show
that if $a + K = a' + K$ and $b + K = b' + K$, then $(a + K)(b + K) = ab + K =
a'b' + K = (a' + K)(b' + K)$. However, if $a + K = a' + K$, then $a - a' \in K$,
so $(a - a')b \in K$, since K is an ideal of R (in fact, so far, since K is a right
ideal of R). Because $b + K = b' + K$, we have $b - b' \in K$, so $a'(b - b') \in K$,
since K is an ideal of R (in fact, since K is a left ideal of R). So both $(a - a')b =
ab - a'b$ and $a'(b - b') = a'b - a'b'$ are in K. Thus $(ab - a'b) + (a'b - a'b') =
ab - a'b' \in K$.

But this tells us (just from group theory) that ab + K = a'b’ + K, ex- 
actly what we needed to have the product well-defined. 

So R/K is now endowed with a sum and a product. Furthermore, the
mapping $\varphi: R \to R/K$ defined by $\varphi(a) = a + K$ for $a \in R$ is a
homomorphism of R onto R/K with kernel K. (Prove!) This tells us right away that
R/K is a ring, being the homomorphic image of the ring R.

We summarize all this in


Theorem 4.3.2. Let K be an ideal of R. Then the quotient group R/K
as an additive group is a ring under the multiplication $(a + K)(b + K) =
ab + K$. Furthermore, the map $\varphi: R \to R/K$ defined by $\varphi(a) = a + K$ for $a \in R$
is a homomorphism of R onto R/K having K as its kernel. So R/K is a
homomorphic image of R.


Just from group-theoretic consideration of R as an additive group, we
have that if $\varphi$ is a homomorphism of R into R', then it is 1-1 if and only if
Ker $\varphi$ = (0). As in groups, we define a homomorphism to be a
monomorphism if it is 1-1. A monomorphism which is also onto is called an
isomorphism. We define R and R' to be isomorphic if there is an isomorphism of R
onto R'.

An isomorphism from a ring R onto itself is called an automorphism of
R. For example, suppose R is the field C of complex numbers. Then the
mapping from C to C sending each element of C to its complex conjugate is an
automorphism of C. (Prove!)
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One would have to be an awful pessimist to expect that the homomor- 
phism theorems proved in Section 7 of Chapter 2 fail in this setting. In fact 
they do hold, with the slightly obvious adaptation needed to make the proofs 
go through. We state the homomorphism theorems without any further ado, 
leaving the few details needed to complete the proofs to the reader. 


Theorem 4.3.3 (First Homomorphism Theorem). Let the mapping
$\varphi: R \to R'$ be a homomorphism of R onto R' with kernel K. Then $R' \simeq R/K$;
in fact, the mapping $\psi: R/K \to R'$ defined by $\psi(a + K) = \varphi(a)$ defines an
isomorphism of R/K onto R'.


Theorem 4.3.4 (Correspondence Theorem). Let the mapping $\varphi: R \to R'$
be a homomorphism of R onto R' with kernel K. If I' is an ideal of R', let
$I = \{a \in R \mid \varphi(a) \in I'\}$. Then I is an ideal of R, $I \supset K$ and $I/K \simeq I'$. This sets
up a 1-1 correspondence between all the ideals of R' and those ideals of R
that contain K.


Theorem 4.3.5 (Second Homomorphism Theorem). Let A be a
subring of a ring R and I an ideal of R. Then $A + I = \{a + i \mid a \in A, i \in I\}$ is a
subring of R, I is an ideal of A + I, and $(A + I)/I \simeq A/(A \cap I)$.


Theorem 4.3.6 (Third Homomorphism Theorem). Let the mapping
$\varphi: R \to R'$ be a homomorphism of R onto R' with kernel K. If I' is an ideal
of R' and $I = \{a \in R \mid \varphi(a) \in I'\}$, then $R/I \simeq R'/I'$. Equivalently, if K is an
ideal of R and $I \supset K$ is an ideal of R, then $R/I \simeq (R/K)/(I/K)$.


We close this section with an inspection of some of the things we have 
discussed in some examples. 


Examples 


1. As usual we use Z, the ring of integers, for our first example. Let n > 1 be
a fixed integer and let $I_n$ be the set of all multiples of n; then $I_n$ is an ideal of
Z. If $Z_n$ is the integers mod n, define $\varphi: Z \to Z_n$ by $\varphi(a) = [a]$. As is easily
seen, $\varphi$ is a homomorphism of Z onto $Z_n$ with kernel $I_n$. So by Theorem
4.3.3, $Z_n \simeq Z/I_n$. (This should come as no surprise, for that is how we
originally introduced $Z_n$.)
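
The claims of Example 1 can be spot-checked numerically. The sketch below assumes the modulus n = 6 and a finite sample of integers (both chosen here for illustration), and models the class [a] by the remainder `a % n`; it is a check on a sample, not a proof.

```python
n = 6  # illustrative modulus; any integer n > 1 would do

def phi(a):
    # Represent the class [a] in Z_n by its remainder in 0, 1, ..., n - 1.
    return a % n

sample = range(-20, 21)

preserves_sum = all(phi(a + b) == (phi(a) + phi(b)) % n
                    for a in sample for b in sample)
preserves_product = all(phi(a * b) == (phi(a) * phi(b)) % n
                        for a in sample for b in sample)

# Elements of the sample sent to [0]: these should be the multiples of n.
kernel_sample = [a for a in sample if phi(a) == 0]

print(preserves_sum, preserves_product)
print(kernel_sample)
```

The printed kernel sample consists exactly of the multiples of 6 in the chosen range, in line with Ker $\varphi$ = $I_n$.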


2. Let F be a field; what can the ideals of F be? Suppose that $I \neq (0)$ is an
ideal of F; let $a \neq 0 \in I$. Then, since I is an ideal of F, $1 = a^{-1}a \in I$; but
now, since $1 \in I$, $r1 = r \in I$ for every $r \in F$. In short, I = F. So F has only
the trivial ideals (0) and F itself.


3. Let R be the ring of all rational numbers having odd denominators in their
reduced form. Let I be those elements of R which in reduced form have an
even numerator; it is easy to see that I is an ideal of R. Define $\varphi: R \to Z_2$, the
integers mod 2, by $\varphi(a/b) = 0$ if a is even (a, b have no common factor) and
$\varphi(a/b) = 1$ if a is odd. We leave it to the reader to verify that $\varphi$ is a
homomorphism of R onto $Z_2$ with kernel I. Thus $Z_2 \simeq R/I$. Give the explicit
isomorphism of R/I onto $Z_2$.
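
The verification left to the reader in Example 3 can be previewed on a finite sample. This sketch uses Python's `fractions.Fraction`, which always stores a rational number in lowest terms, so the parity of the numerator is well defined. The sample of fractions is an arbitrary choice made here.

```python
from fractions import Fraction
from itertools import product

def phi(q):
    # q is assumed to lie in R: in lowest terms its denominator is odd.
    assert q.denominator % 2 == 1, "q must have an odd denominator"
    return q.numerator % 2  # 0 for an even numerator, 1 for an odd one

# A finite sample of elements of R (odd denominators only).
sample = [Fraction(a, b) for a in range(-6, 7) for b in (1, 3, 5, 7)]

additive = all(phi(x + y) == (phi(x) + phi(y)) % 2
               for x, y in product(sample, repeat=2))
multiplicative = all(phi(x * y) == (phi(x) * phi(y)) % 2
                     for x, y in product(sample, repeat=2))

print(additive, multiplicative)
```

Sums and products of fractions with odd denominators again have odd denominators, so `phi` is defined on every value computed above.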


4. Let R be the ring of all rational numbers whose denominators (when in
reduced form) are not divisible by p, p a fixed prime. Let I be those elements in
R whose numerator is divisible by p; I is an ideal of R and $R/I \simeq Z_p$, the
integers mod p. (Prove!)


5. Let R be the ring of all real-valued continuous functions on the closed unit
interval where $(f + g)(x) = f(x) + g(x)$ and $(fg)(x) = f(x)g(x)$ for $f, g \in R$,
$x \in [0, 1]$. Let $I = \{f \in R \mid f(\tfrac{1}{2}) = 0\}$. We claim that I is an ideal of R.
Clearly, it is an additive subgroup. Furthermore if $f \in I$ and $g \in R$, then
$f(\tfrac{1}{2}) = 0$, so $(fg)(\tfrac{1}{2}) = f(\tfrac{1}{2})g(\tfrac{1}{2}) = 0g(\tfrac{1}{2}) = 0$. Thus $fg \in I$; since R is
commutative, gf is also in I. So I is an ideal of R.

What is R/I? Given any $f \in R$, then


$f(x) = (f(x) - f(\tfrac{1}{2})) + f(\tfrac{1}{2}) = g(x) + f(\tfrac{1}{2})$,


where $g(x) = f(x) - f(\tfrac{1}{2})$. Because $g(\tfrac{1}{2}) = f(\tfrac{1}{2}) - f(\tfrac{1}{2}) = 0$, g is in I. So
$g + I = I$. Thus $f + I = (f(\tfrac{1}{2}) + g) + I = f(\tfrac{1}{2}) + I$. Because $f(\tfrac{1}{2})$ is just a real
number, R/I consists of the cosets $\alpha + I$ for $\alpha$ real. We claim that every real
$\alpha$ comes up. For if $f(\tfrac{1}{2}) = \beta \neq 0$, then


$\alpha\beta^{-1}f + I = (\alpha\beta^{-1} + I)(f + I) = (\alpha\beta^{-1} + I)(f(\tfrac{1}{2}) + I)$
$= (\alpha\beta^{-1} + I)(\beta + I) = \alpha\beta^{-1}\beta + I = \alpha + I$.


So R/I consists of all $\alpha + I$, $\alpha$ real. Thus we have shown directly that R/I is
isomorphic to the real field.

We now use Theorem 4.3.3 to show that $R/I \simeq$ real field. Let $\varphi: R \to \mathbb{R}$
be defined by $\varphi(f) = f(\tfrac{1}{2})$. Then it is easy to verify that $\varphi$ is a
homomorphism, $\varphi$ is onto and Ker $\varphi$ = $\{f \in R \mid f(\tfrac{1}{2}) = 0\}$; in other words, Ker $\varphi$ = I.
So $R/I \simeq$ image of $\varphi$ = $\mathbb{R}$.


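6. Let R be the integral quaternions, in other words, $R = \{a_0 + a_1 i + a_2 j +
a_3 k \mid a_0, a_1, a_2, a_3 \in Z\}$. For a fixed prime p, let


$I_p = \{a_0 + a_1 i + a_2 j + a_3 k \in R \mid p \mid a_i \text{ for } i = 0, 1, 2, 3\}$.


The reader should verify that $I_p$ is an ideal of R and that $R/I_p \simeq H(Z_p)$ (see
Problem 36 of Section 1 and the paragraph before it).


7. Let $R = \left\{ \begin{pmatrix} a & b \\ 0 & a \end{pmatrix} \,\middle|\, a, b \in \mathbb{R} \right\}$; R is a subring of the $2 \times 2$ matrices over the
reals. Let $I = \left\{ \begin{pmatrix} 0 & b \\ 0 & 0 \end{pmatrix} \,\middle|\, b \in \mathbb{R} \right\}$; I is easily seen to be an additive subgroup of R. Is it
an ideal of R? Consider


$\begin{pmatrix} x & y \\ 0 & x \end{pmatrix}\begin{pmatrix} 0 & b \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & xb \\ 0 & 0 \end{pmatrix}$,


so it is in I. Similarly,


$\begin{pmatrix} 0 & b \\ 0 & 0 \end{pmatrix}\begin{pmatrix} x & y \\ 0 & x \end{pmatrix} = \begin{pmatrix} 0 & bx \\ 0 & 0 \end{pmatrix}$,


so it, too, is in I. So I is an ideal of R. What is R/I? We approach it from two
points of view.

Given $\begin{pmatrix} a & b \\ 0 & a \end{pmatrix} \in R$, then
$\begin{pmatrix} a & b \\ 0 & a \end{pmatrix} = \begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix} + \begin{pmatrix} 0 & b \\ 0 & 0 \end{pmatrix}$, so that


$\begin{pmatrix} a & b \\ 0 & a \end{pmatrix} + I = \left( \begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix} + \begin{pmatrix} 0 & b \\ 0 & 0 \end{pmatrix} \right) + I = \begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix} + I$,


since $\begin{pmatrix} 0 & b \\ 0 & 0 \end{pmatrix}$ is in I. Thus all the cosets of I in R look like
$\begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix} + I$. If we map this onto a, that is,
$\begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix} + I \mapsto a$, we can check that this is an isomorphism
onto the real field. So $R/I \simeq \mathbb{R}$.

We see that $R/I \simeq \mathbb{R}$ another way. Define $\varphi: R \to \mathbb{R}$ by
$\varphi\begin{pmatrix} a & b \\ 0 & a \end{pmatrix} = a$. We
claim that $\varphi$ is a homomorphism. For, given
$\begin{pmatrix} a & b \\ 0 & a \end{pmatrix}, \begin{pmatrix} c & d \\ 0 & c \end{pmatrix}$, then


$\begin{pmatrix} a & b \\ 0 & a \end{pmatrix} + \begin{pmatrix} c & d \\ 0 & c \end{pmatrix} = \begin{pmatrix} a + c & b + d \\ 0 & a + c \end{pmatrix}$


and


$\begin{pmatrix} a & b \\ 0 & a \end{pmatrix}\begin{pmatrix} c & d \\ 0 & c \end{pmatrix} = \begin{pmatrix} ac & ad + bc \\ 0 & ac \end{pmatrix}$,


hence


$\varphi\left( \begin{pmatrix} a & b \\ 0 & a \end{pmatrix} + \begin{pmatrix} c & d \\ 0 & c \end{pmatrix} \right) = \varphi\begin{pmatrix} a + c & b + d \\ 0 & a + c \end{pmatrix} = a + c = \varphi\begin{pmatrix} a & b \\ 0 & a \end{pmatrix} + \varphi\begin{pmatrix} c & d \\ 0 & c \end{pmatrix}$


and


$\varphi\left( \begin{pmatrix} a & b \\ 0 & a \end{pmatrix}\begin{pmatrix} c & d \\ 0 & c \end{pmatrix} \right) = \varphi\begin{pmatrix} ac & ad + bc \\ 0 & ac \end{pmatrix} = ac = \varphi\begin{pmatrix} a & b \\ 0 & a \end{pmatrix}\varphi\begin{pmatrix} c & d \\ 0 & c \end{pmatrix}$.


So $\varphi$ is indeed a homomorphism of R onto $\mathbb{R}$. What is Ker $\varphi$? If
$\begin{pmatrix} a & b \\ 0 & a \end{pmatrix} \in$ Ker $\varphi$,
then $\varphi\begin{pmatrix} a & b \\ 0 & a \end{pmatrix} = a$, but also
$\varphi\begin{pmatrix} a & b \\ 0 & a \end{pmatrix} = 0$, since
$\begin{pmatrix} a & b \\ 0 & a \end{pmatrix} \in$ Ker $\varphi$.
Thus a = 0. From this we see that I = Ker $\varphi$. So $R/I \simeq$ image of $\varphi$ = $\mathbb{R}$ by
Theorem 4.3.3.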
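
The computations of Example 7 can be replayed on a finite sample. In the sketch below, integer entries stand in for real numbers (an assumption made so that the arithmetic is exact), and the helper names `make`, `mat_add`, `mat_mul`, and `phi` are introduced here for illustration.

```python
from itertools import product

def make(a, b):
    # The matrix with rows (a, b) and (0, a), stored as a nested tuple.
    return ((a, b), (0, a))

def mat_add(x, y):
    return tuple(tuple(x[i][j] + y[i][j] for j in range(2)) for i in range(2))

def mat_mul(x, y):
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def phi(m):
    # The map of Example 7: send the matrix to its diagonal entry a.
    return m[0][0]

values = range(-3, 4)
elements = [make(a, b) for a in values for b in values]
pairs = list(product(elements, repeat=2))

additive = all(phi(mat_add(x, y)) == phi(x) + phi(y) for x, y in pairs)
multiplicative = all(phi(mat_mul(x, y)) == phi(x) * phi(y) for x, y in pairs)

# Sample elements sent to 0: each should have the form with rows (0, b), (0, 0).
kernel = [m for m in elements if phi(m) == 0]
kernel_in_I = all(m == ((0, m[0][1]), (0, 0)) for m in kernel)

print(additive, multiplicative, kernel_in_I)
```

A check on 49 sample matrices is of course no substitute for the general computation in the text; it only confirms the formulas for the sum and product.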


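8. Let $R = \left\{ \begin{pmatrix} a & b \\ -b & a \end{pmatrix} \,\middle|\, a, b \in \mathbb{R} \right\}$ and let C be the field of complex numbers.
Define $\psi: R \to C$ by $\psi\begin{pmatrix} a & b \\ -b & a \end{pmatrix} = a + bi$. We leave it to the reader to verify that
$\psi$ is an isomorphism of R onto C. So R is isomorphic to the field of complex
numbers.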
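
As a sanity check for Example 8 (a sketch on a sample, not the requested verification), the snippet below compares matrix arithmetic with Python's built-in `complex` arithmetic. Small integer entries are assumed so that floating-point results are exact; the helper names are illustrative.

```python
from itertools import product

def make(a, b):
    # The matrix with rows (a, b) and (-b, a).
    return ((a, b), (-b, a))

def mat_add(x, y):
    return tuple(tuple(x[i][j] + y[i][j] for j in range(2)) for i in range(2))

def mat_mul(x, y):
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def psi(m):
    # The map of Example 8: the matrix goes to a + bi.
    return complex(m[0][0], m[0][1])

values = range(-3, 4)
elements = [make(a, b) for a in values for b in values]
pairs = list(product(elements, repeat=2))

additive = all(psi(mat_add(x, y)) == psi(x) + psi(y) for x, y in pairs)
multiplicative = all(psi(mat_mul(x, y)) == psi(x) * psi(y) for x, y in pairs)

# The matrix corresponding to i squares to the matrix corresponding to -1.
i_matrix = make(0, 1)
print(additive, multiplicative, mat_mul(i_matrix, i_matrix) == make(-1, 0))
```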


9. Let R be any commutative ring with 1. If $a \in R$, let $(a) = \{xa \mid x \in R\}$. We
claim that (a) is an ideal of R. To see this, suppose that $u, v \in (a)$; thus
$u = xa$, $v = ya$ for some $x, y \in R$, whence


$u + v = xa + ya = (x + y)a \in (a)$.


Also, if $u \in (a)$ and $r \in R$, then $u = xa$, hence $ru = r(xa) = (rx)a$, so is in (a).
Thus (a) is an ideal of R.

Note that if R is not commutative, then (a) need not be an ideal; but it
is certainly a left ideal of R.
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PROBLEMS 


Easier Problems 


1. If R is a commutative ring and $a \in R$, let $L(a) = \{x \in R \mid xa = 0\}$. Prove
that L(a) is an ideal of R.

2. If R is a commutative ring with 1 and R has no ideals other than (0) and
itself, prove that R is a field. (Hint: Look at Example 9.)

*3. If $\varphi: R \to R'$ is a homomorphism of R onto R' and R has a unit element,
1, show that $\varphi(1)$ is the unit element of R'.

4. If I, J are ideals of R, define I + J by $I + J = \{i + j \mid i \in I, j \in J\}$. Prove
that I + J is an ideal of R.

5. If I is an ideal of R and A is a subring of R, show that $I \cap A$ is an ideal of A.

6. If I, J are ideals of R, show that $I \cap J$ is an ideal of R.

7. Give a complete proof of Theorem 4.3.2.

8. Give a complete proof of Theorem 4.3.4.

9. Let $\varphi: R \to R'$ be a homomorphism of R onto R' with kernel K. If A' is
a subring of R', let $A = \{a \in R \mid \varphi(a) \in A'\}$. Show that:
(a) A is a subring of R, $A \supset K$.
(b) $A/K \simeq A'$.
(c) If A' is a left ideal of R', then A is a left ideal of R.

10. Prove Theorem 4.3.6.

11. In Example 3, give the explicit isomorphism of R/I onto $Z_2$.

12. In Example 4, show that $R/I \simeq Z_p$.

13. In Example 6, show that $R/I_p \simeq H(Z_p)$.

14. In Example 8, verify that the mapping $\psi$ given is an isomorphism of R onto C.

15. If I, J are ideals of R, let IJ be the set of all sums of elements of the form
ij, where $i \in I$, $j \in J$. Prove that IJ is an ideal of R.

16. Show that the ring of $2 \times 2$ matrices over the reals has nontrivial left
ideals (and also nontrivial right ideals).

17. Prove Theorem 4.3.5.


If R, S are rings, define the direct sum of R and S, $R \oplus S$, by
$R \oplus S = \{(r, s) \mid r \in R, s \in S\}$
where $(r, s) = (r_1, s_1)$ if and only if $r = r_1$, $s = s_1$, and where


$(r, s) + (t, u) = (r + t, s + u)$, $(r, s)(t, u) = (rt, su)$.

18. Show that $R \oplus S$ is a ring and that the subrings $\{(r, 0) \mid r \in R\}$ and
$\{(0, s) \mid s \in S\}$ are ideals of $R \oplus S$ isomorphic to R and S, respectively.

19. If $R = \left\{ \begin{pmatrix} a & b \\ 0 & c \end{pmatrix} \,\middle|\, a, b, c \text{ real} \right\}$ and
$I = \left\{ \begin{pmatrix} 0 & b \\ 0 & 0 \end{pmatrix} \,\middle|\, b \text{ real} \right\}$, show that:
(a) R is a ring.
(b) I is an ideal of R.
(c) $R/I \simeq F \oplus F$, where F is the field of real numbers.

20. If I, J are ideals of R, let $R_1 = R/I$ and $R_2 = R/J$. Show that $\varphi: R \to R_1 \oplus R_2$
defined by $\varphi(r) = (r + I, r + J)$ is a homomorphism of R into $R_1 \oplus R_2$
such that Ker $\varphi$ = $I \cap J$.

21. Let $Z_{15}$ be the ring of integers mod 15. Show that $Z_{15} \simeq Z_3 \oplus Z_5$.


Middle-Level Problems 


22. Let Z be the ring of integers and m, n two relatively prime integers, $I_m$
the multiples of m in Z, and $I_n$ the multiples of n in Z.
(a) What is $I_m \cap I_n$?
(b) Use the result of Problem 20 to show that there is a one-to-one
homomorphism from $Z/I_{mn}$ to $Z/I_m \oplus Z/I_n$.

23. If m, n are relatively prime, prove that $Z_{mn} \simeq Z_m \oplus Z_n$. (Hint: Use a
counting argument to show that the homomorphism of Problem 22(b) is onto.)

*24. Use the result of Problem 23 to prove the Chinese Remainder Theorem,
which asserts that if m and n are relatively prime integers and a, b any
integers, we can find an integer x such that $x \equiv a \bmod m$ and $x \equiv b \bmod n$
simultaneously.

25. Let R be the ring of $2 \times 2$ matrices over the real numbers; suppose that I
is an ideal of R. Show that I = (0) or I = R. (Contrast this with the result of
Problem 16.)


Harder Problems 


26. Let R be a ring with 1 and let S be the ring of $2 \times 2$ matrices over R. If I
is an ideal of S show that there is an ideal J of R such that I consists of all
the $2 \times 2$ matrices over J.

27. If $p_1, p_2, \ldots, p_n$ are distinct odd primes, show that there are exactly $2^n$
solutions of $x^2 \equiv x \bmod (p_1 \cdots p_n)$, where $0 \leq x < p_1 \cdots p_n$.

28. Suppose that R is a ring whose only left ideals are (0) and R itself. Prove
that either R is a division ring or R has p elements, p a prime, and ab = 0
for every $a, b \in R$.
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29. Let R be a ring with 1. An element a € R is said to have a left inverse if 
ba = 1 for some b € R. Show that if the left inverse b of a is unique, then 
ab = 1 (so b is also a right inverse of a).


4. MAXIMAL IDEALS 


This will be a section with one major theorem. The importance of this result 
will only become fully transparent when we discuss fields in Chapter 5. How- 
ever, it is a result that stands on its own two feet. It isn’t difficult to prove, 
but in mathematics the correlation between difficult and important isn’t al- 
ways that high. There are many difficult results that are of very little interest 
and of even less importance, and some easy results that are crucial. Of 
course, there are some results—many, many—which are of incredible diffi- 
culty and importance. 


Lemma 4.4.1. Let R be a commutative ring with unit whose only 
ideals are (0) and itself. Then R is a field. 


Proof. Let $a \neq 0$ be in R. Then $(a) = \{xa \mid x \in R\}$ is an ideal of R, as
we verified in Example 9 in the preceding section. Since $a = 1a \in (a)$, $(a) \neq (0)$.
Thus, by our hypothesis on R, (a) = R. But then, by the definition of (a),
every element $i \in R$ is a multiple xa of a for some $x \in R$. In particular,
because $1 \in R$, 1 = ba for some $b \in R$. This shows that a has the inverse b in R.
So R is a field. □


In Theorem 4.3.4, the Correspondence Theorem, we saw that if
$\varphi: R \to R'$ is a homomorphism of R onto R' with kernel K, then there is a
1-1 correspondence between ideals of R' and ideals of R that contain K.
Suppose that there are no ideals other than K itself and R which contain K.
What does this imply about R'? Since (0) in R' corresponds to K in R, and
R' corresponds to all of R, we must conclude that in this case R' has no
ideals other than (0) and itself. So if R' is commutative and has a unit
element, then, by Lemma 4.4.1, R' must be a field.

This prompts the following definition. 


Definition. A proper ideal M of R is a maximal ideal of R if the only 
ideals of R that contain M are M itself and R. 


The discussion preceding this definition has already almost proved 
for us 




Theorem 4.4.2. Let R be a commutative ring with 1, and let M be a 
maximal ideal of R. Then R/M is a field. 


Proof. There is a homomorphism of R onto R' = R/M, and since $1 \in R$
we have that R' has 1 + M as its unit element. (See Problem 3, Section 3.)
Because M is a maximal ideal of R, we saw in the discussion above that R'
has no nontrivial ideals. Thus, by Lemma 4.4.1, R' = R/M is a field. □


This theorem will be our entry into the discussion of fields, for it will 
enable us to construct particularly desirable fields whenever we shall need 
them. 

Theorem 4.4.2 has a converse. This is 


Theorem 4.4.3. If R is a commutative ring with 1 and M an ideal of R 
such that R/M is a field, then M is a maximal ideal of R. 


Proof. We saw in Example 2 of Section 3 that the only ideals in a field
F are (0) and F itself. Since R/M is a field, it has only (0) and itself as ideals.
But then, by the correspondence given us by Theorem 4.3.4, there can be no
ideal of R properly between M and R. Thus M is a maximal ideal of R. □


We give a few examples of maximal ideals in commutative rings. 


Examples 


1. Let Z be the integers and M an ideal of Z. As an ideal of Z we certainly
have that M is an additive subgroup of Z, so must consist of all multiples of
some fixed integer n. Thus, since $Z/M \simeq Z_n$ and since $Z_n$ is a field if and only
if n is a prime, we see that M is a maximal ideal of Z if and only if M consists
of all the multiples of some prime p. Thus the set of maximal ideals in Z
corresponds to the set of prime numbers.


2. Let Z be the integers, and let $R = \{a + bi \mid a, b \in Z\}$, a subring of
C ($i^2 = -1$). Let M be the set of all a + bi in R, where $3 \mid a$ and $3 \mid b$. We
leave it to the reader to verify that M is an ideal of R.

We claim that M is a maximal ideal of R. For suppose that $N \supset M$ and
$N \neq M$ is an ideal of R. So there is an element $r + si \in N$, where 3 does not
divide r or 3 does not divide s. Therefore, $3 \nmid (r^2 + s^2)$. (Prove using
congruences mod 3!) But $t = r^2 + s^2 = (r + si)(r - si)$, so is in N, since $r + si \in N$
and N is an ideal of R. So N has an integer $t = r^2 + s^2$ not divisible by 3. Thus
$ut + 3v = 1$ for some integers u, v; but $t \in N$, hence $ut \in N$ and $3 \in M \subset N$,
so $3v \in N$. Therefore, $1 = ut + 3v \in N$. Therefore, $(a + bi)1 \in N$, since N is
an ideal of R, for all $a + bi \in R$. This tells us that N = R. So the only ideal of
R above M is R itself. Consequently, M is a maximal ideal of R.
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By Theorem 4.4.2 we know that R/M is a field. It can be shown (see 
Problem 2) that R/M is a field having nine elements. 
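
The nine-element field can be made concrete. The following sketch models R/M by pairs (a, b) with a, b taken mod 3, standing for the coset of a + bi; this representation and the function name `mul` are assumptions made here for illustration. It checks by brute force that every nonzero coset has exactly one multiplicative inverse.

```python
p = 3

# Cosets of M in R, represented by a + bi with a, b in {0, 1, 2}.
elements = [(a, b) for a in range(p) for b in range(p)]

def mul(x, y):
    # (a + bi)(c + di) = (ac - bd) + (ad + bc)i, with coefficients reduced mod p.
    a, b = x
    c, d = y
    return ((a * c - b * d) % p, (a * d + b * c) % p)

one = (1, 0)
nonzero = [x for x in elements if x != (0, 0)]

# For each nonzero coset, list every coset that multiplies it to 1.
inverses = {x: [y for y in nonzero if mul(x, y) == one] for x in nonzero}
every_nonzero_invertible = all(len(ys) == 1 for ys in inverses.values())

print(len(elements), every_nonzero_invertible)
```

Together with commutativity, the existence of inverses is what makes this nine-element quotient a field, in agreement with Theorem 4.4.2.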


3. Let R be as in Example 2 and let $I = \{a + bi \mid 5 \mid a \text{ and } 5 \mid b\}$. We assert that
I is not a maximal ideal of R.

In R we can factor $5 = (2 + i)(2 - i)$. Let $M = \{x(2 + i) \mid x \in R\}$. M is
an ideal of R, and since $5 = (2 + i)(2 - i)$ is in M, we see that $I \subset M$. Clearly,
$I \neq M$ for $2 + i \in M$ and is not in I because $5 \nmid 2$. So $I \neq M$. Can M = R?
If so, then $(2 + i)(a + bi) = 1$ for some a, b. This gives $2a - b = 1$ and
$2b + a = 0$; these two equations imply that $5a = 2$, so $a = \tfrac{2}{5}$, $b = -\tfrac{1}{5}$. But
$\tfrac{2}{5} \notin Z$, $-\tfrac{1}{5} \notin Z$; the element $a + bi = \tfrac{2}{5} - \tfrac{1}{5}i$ is not in R. So $M \neq R$.

One can show, however, that M is a maximal ideal of R. (See Problem 3.) 
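
One way to see concretely that I is not maximal is to look for zero-divisors in R/I, since a field has none. The sketch below again models cosets by pairs (a, b), now reduced mod 5 (an illustrative representation, with a hypothetical helper `mul_mod`), and uses the factorization $5 = (2 + i)(2 - i)$ from the text.

```python
def mul_mod(x, y, p):
    # (a + bi)(c + di) = (ac - bd) + (ad + bc)i, with coefficients reduced mod p.
    a, b = x
    c, d = y
    return ((a * c - b * d) % p, (a * d + b * c) % p)

p = 5
zero = (0, 0)
elements = [(a, b) for a in range(p) for b in range(p)]
nonzero = [x for x in elements if x != zero]

# The cosets of 2 + i and 2 - i; note that -1 is 4 mod 5.
u = (2, 1)
v = (2, 4)
product_uv = mul_mod(u, v, p)

# Nonzero cosets that annihilate some other nonzero coset.
zero_divisors = [x for x in nonzero
                 if any(mul_mod(x, y, p) == zero for y in nonzero)]

print(product_uv)           # the coset of 5, which is the zero coset
print(len(zero_divisors) > 0)
```

Two nonzero cosets with zero product cannot occur in a field, so R/I is not a field, and by Theorem 4.4.2 the ideal I cannot be maximal.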


4. Let $R = \{a + b\sqrt{2} \mid a, b \text{ integers}\}$, which is a subring of the real field under
the sum and product of real numbers. That R is a ring follows from


$(a + b\sqrt{2}) + (c + d\sqrt{2}) = (a + c) + (b + d)\sqrt{2}$
and
$(a + b\sqrt{2})(c + d\sqrt{2}) = (ac + 2bd) + (ad + bc)\sqrt{2}$.


Let $M = \{a + b\sqrt{2} \in R \mid 5 \mid a \text{ and } 5 \mid b\}$. M is easily seen to be an ideal of R.
We leave it to the reader to show that M is a maximal ideal of R and that
R/M is a field having 25 elements.


5. Let R be the ring of all real-valued continuous functions on the closed
unit interval [0, 1]. We showed in Example 5 of Section 3 that if M =
$\{f \in R \mid f(\tfrac{1}{2}) = 0\}$, then M is an ideal of R and R/M is isomorphic to the real
field. Thus, by Theorem 4.4.3, M is a maximal ideal of R.

Of course, if we let $M_\gamma = \{f \in R \mid f(\gamma) = 0\}$, where $\gamma \in [0, 1]$, then $M_\gamma$ is
also a maximal ideal. It can be shown that every maximal ideal in R is of the
form $M_\gamma$ for some $\gamma \in [0, 1]$, but to prove it we would require some results from
real variable theory.

What this example shows is that the maximal ideals in R correspond to
the points of [0, 1].


PROBLEMS 


1. If a, b are integers and $3 \nmid a$ or $3 \nmid b$, show that $3 \nmid (a^2 + b^2)$.
2. Show that in Example 2, R/M is a field having nine elements.
3. In Example 3, show that $M = \{x(2 + i) \mid x \in R\}$ is a maximal ideal of R.
4. In Example 3, show that $R/M \simeq Z_5$.
5. In Example 3, show that $R/I \simeq Z_5 \oplus Z_5$.
6. In Example 4, show that M is a maximal ideal of R.
7. In Example 4, show that R/M is a field having 25 elements.
8. Using Example 2 as a model, construct a field having 49 elements.


We make a short excursion back to congruences mod p, where p is an odd
prime. If a is an integer such that $p \nmid a$ and $x^2 \equiv a \bmod p$ has a solution x in Z,
we say that a is a quadratic residue mod p. Otherwise, a is said to be a quadratic
nonresidue mod p.


9. Show that (p - 1)/2 of the numbers 1, 2, ..., p - 1 are quadratic
residues and (p - 1)/2 are quadratic nonresidues mod p. [Hint: Show
that $\{x^2 \mid x \neq 0 \in Z_p\}$ forms a group of order (p - 1)/2.]

10. Let m > 0 be in Z, and suppose that m is not a square in Z. Let R =
$\{a + \sqrt{m}\, b \mid a, b \in Z\}$. Prove that under the operations of sum and
product of real numbers R is a ring.

11. If p is an odd prime, let us set $I_p = \{a + \sqrt{m}\, b \mid p \mid a \text{ and } p \mid b\}$, where
$a + \sqrt{m}\, b \in R$, the ring in Problem 10. Show that $I_p$ is an ideal of R.

12. If m is a quadratic nonresidue mod p, show that the ideal $I_p$ in Problem
11 is a maximal ideal of R.

13. In Problem 12 show that $R/I_p$ is a field having $p^2$ elements.


5. POLYNOMIAL RINGS 


The material that we consider in this section involves the notion of polyno- 
mial and the set of all polynomials over a given field. We hope that most 
readers will have some familiarity with the notion of polynomial from their 
high school days and will have seen some of the things one does with polyno- 
mials: factoring them, looking for their roots, dividing one by another to get 
a remainder, and so on. The emphasis we shall give to the concept and alge- 
braic object known as a polynomial ring will be in a quite different direction 
from that given in high school. 

Be that as it may, what we shall strive to do here is to introduce the 
ring of polynomials over a field and show that this ring is amenable to a care- 
ful dissection that reveals its innermost structure. As we shall see, this ring is 
very well-behaved. The development should remind us of what was done for 
the ring of integers in Section 5 of Chapter 1. Thus we shall run into the ana- 
log of Euclid’s algorithm, greatest common divisor, divisibility, and possibly 
most important, the appropriate analog of prime number. This will lead to 
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unique factorization of a general polynomial into these “prime polyno- 
mials,” and to the nature of the ideals and the maximal ideals in this new 
setting. 

But the polynomial ring enjoys one feature that the ring of integers did 
not: the notion of a root of a polynomial. The study of the nature of such 
roots—which will be done, for the most part, in the next chapter—consti- 
tutes a large and important part of the algebraic history of the past. It goes 
under the title Theory of Equations, and in its honorable past, a large variety 
of magnificent results have been obtained in this area. Hopefully, we shall 
see some of these as our development progresses. 

With this sketchy outline of what we intend to do out of the way, we 
now get down to the nitty-gritty of doing it. 

Let F be a field. By the ring of polynomials in x over F, which we shall
always write as F[x], we mean the set of all formal expressions

    p(x) = a_0 + a_1 x + ... + a_{n-1} x^{n-1} + a_n x^n,  n ≥ 0,

where the a_i, the coefficients of the polynomial p(x), are in F. We
sometimes employ the alternative notation:
p(x) = a_0 x^n + a_1 x^{n-1} + ... + a_n. In F[x] we define equality, sum, and
product of two polynomials so as to make of F[x] a commutative ring as
follows:


1. Equality. We declare p(x) = a_0 + a_1 x + ... + a_n x^n and q(x) =
b_0 + b_1 x + ... + b_n x^n to be equal if and only if their corresponding
coefficients are equal, that is, if and only if a_i = b_i for all i ≥ 0.

We combine this definition of equality of polynomials p(x) and q(x)
with the convention that if

    q(x) = b_0 + b_1 x + ... + b_n x^n

and if b_{m+1} = ... = b_n = 0, then we can drop the last n - m terms and write
q(x) as

    q(x) = b_0 + b_1 x + ... + b_m x^m.

This convention is observed in the definition of addition that follows,
where s is the larger of m and n and we add coefficients a_{n+1} = ... = a_s = 0
if n < s or b_{m+1} = ... = b_s = 0 if m < s.


2. Addition. If p(x) = a_0 + a_1 x + ... + a_n x^n and q(x) = b_0 + b_1 x
+ ... + b_m x^m, we declare p(x) + q(x) = c_0 + c_1 x + ... + c_s x^s, where for
each i, c_i = a_i + b_i.

So we add polynomials by adding their corresponding coefficients. 

The definition of multiplication is a little more complicated. We define 
it loosely at first and then more precisely.


3. Multiplication. If p(x) = a_0 + a_1 x + ... + a_n x^n and q(x) =
b_0 + b_1 x + ... + b_m x^m, we declare p(x)q(x) = c_0 + c_1 x + ... + c_t x^t,
where the c_i are determined by multiplying the expression out formally, using
the distributive laws and the rules of exponents x^u x^v = x^{u+v}, and
collecting terms. More formally,

    c_i = a_i b_0 + a_{i-1} b_1 + ... + a_1 b_{i-1} + a_0 b_i  for every i.


We illustrate these operations with a simple example, but first a
notational device: If some coefficient is 0, we just omit that term. Thus we
write 9 + 0x + 7x^2 + 0x^3 - 14x^4 as 9 + 7x^2 - 14x^4.

Let p(x) = 1 + 3x^2, q(x) = 4 - 5x + 7x^2 - x^3. Then p(x) + q(x) =
5 - 5x + 10x^2 - x^3 while

    p(x)q(x) = (1 + 3x^2)(4 - 5x + 7x^2 - x^3)
             = 4 - 5x + 7x^2 - x^3 + 3x^2(4 - 5x + 7x^2 - x^3)
             = 4 - 5x + 7x^2 - x^3 + 12x^2 - 15x^3 + 21x^4 - 3x^5
             = 4 - 5x + 19x^2 - 16x^3 + 21x^4 - 3x^5.


Try this product using the c_i as given above.
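
For readers who like to experiment, the definitions of sum and product can be sketched in a few lines of Python. This is only an illustration under our own conventions: a polynomial a_0 + a_1 x + ... + a_n x^n is stored as the list [a_0, a_1, ..., a_n], and the helper names trim, poly_add and poly_mul are ours, not standard ones.

```python
def trim(coeffs):
    """Drop trailing zero coefficients, as the equality convention allows."""
    coeffs = list(coeffs)
    while coeffs and coeffs[-1] == 0:
        coeffs.pop()
    return coeffs

def poly_add(p, q):
    """c_i = a_i + b_i, after padding the shorter list with zeros."""
    s = max(len(p), len(q))
    p = list(p) + [0] * (s - len(p))
    q = list(q) + [0] * (s - len(q))
    return trim(a + b for a, b in zip(p, q))

def poly_mul(p, q):
    """c_i = a_i b_0 + a_{i-1} b_1 + ... + a_0 b_i."""
    if not p or not q:
        return []                       # a product with the zero polynomial
    c = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            c[i + j] += a * b
    return trim(c)

p = [1, 0, 3]         # 1 + 3x^2
q = [4, -5, 7, -1]    # 4 - 5x + 7x^2 - x^3
print(poly_add(p, q))  # [5, -5, 10, -1]
print(poly_mul(p, q))  # [4, -5, 19, -16, 21, -3]
```

The two printed lists are the coefficients of the sum and product computed by hand in the example above.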

In some sense this definition of F[x] is not a definition at all. We have 
indulged in some hand waving in it. But it will do. We could employ se- 
quences to formally define F[x] more precisely, but it would merely cloud 
what to most readers is well known. 

The first remark that we make—and do not verify—is that F[x] is a 
commutative ring. To go through the details of checking the axioms for a 
commutative ring is a straightforward but laborious task. However, it is im- 
portant to note 


Lemma 4.5.1. F[x] is a commutative ring with unit. 


Definition. If p(x) = a_0 + a_1 x + ... + a_n x^n and a_n ≠ 0, then the
degree of p(x), denoted by deg p(x), is n.


So the degree of a polynomial p(x) is the highest power of x that occurs in
the expression for p(x) with a nonzero coefficient. Thus deg(x - x^2 + x^4) = 4,
deg(7x) = 1, deg 7 = 0. (Note that this definition does not assign a degree to
0. It is, however, sometimes convenient to adopt the convention that the
degree of 0 be -∞, in which case many degree-related results will hold in this
extended context.) The polynomials of degree 0 and the polynomial 0 are
called the constants; thus the set of constants can be identified with F.

The degree function on F[x] will play a similar role to that played by
the size of the integers in Z, in that it will provide us with a version of
Euclid’s Algorithm for F[x].

One immediate and important property of the degree function is that it 
behaves well for products. 


Lemma 4.5.2. If p(x), q(x) are nonzero elements of F[x], then 
deg(p(x)q(x)) = deg p(x) + deg q(x). 


Proof. Let m = deg p(x) and n = deg q(x); thus the polynomial p(x) =
a_0 + a_1 x + ... + a_m x^m, where a_m ≠ 0, and the polynomial q(x) = b_0 +
b_1 x + ... + b_n x^n, where b_n ≠ 0. The highest power of x that can occur in
p(x)q(x) is x^{m+n}, from our definition of the product. What is the coefficient
of x^{m+n}? The only way that x^{m+n} can occur is from (a_m x^m)(b_n x^n) =
a_m b_n x^{m+n}. So the coefficient of x^{m+n} in p(x)q(x) is a_m b_n, which is
not 0, since a_m ≠ 0, b_n ≠ 0 and a field has no zero-divisors. Thus
deg(p(x)q(x)) = m + n = deg p(x) + deg q(x), as claimed in the lemma. □


One also has some information about deg(p(x) + q(x)). This is 


Lemma 4.5.3. If p(x), q(x) ∈ F[x] and p(x) + q(x) ≠ 0, then
deg(p(x) + q(x)) ≤ max(deg p(x), deg q(x)).


We leave the proof of Lemma 4.5.3 to the reader. It will play no role in 
what is to come, whereas Lemma 4.5.2 will be important. We put it in so that 
the “+” should not feel slighted vis-a-vis the product. 
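
The contrast between the two lemmas is easy to see on a small example. The following Python sketch is only an illustration; polynomials are stored as coefficient lists [a_0, a_1, ...] with integer coefficients, and the helper names trim, deg, add and mul are our own choices.

```python
def trim(c):
    c = list(c)
    while c and c[-1] == 0:
        c.pop()
    return c

def deg(p):
    """Degree of [a_0, a_1, ...]; None for the zero polynomial (no degree assigned)."""
    p = trim(p)
    return len(p) - 1 if p else None

def add(p, q):
    n = max(len(p), len(q))
    return trim((p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                for i in range(n))

def mul(p, q):
    c = [0] * max(len(p) + len(q) - 1, 0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            c[i + j] += a * b
    return trim(c)

p = [1, 0, 1]     # 1 + x^2
q = [0, 1, -1]    # x - x^2
print(deg(mul(p, q)), deg(p) + deg(q))       # 4 4
print(deg(add(p, q)), max(deg(p), deg(q)))   # 1 2
```

The degree of the product is exactly the sum of the degrees, while the degree of the sum drops below the maximum here because the leading terms cancel.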

An immediate consequence of Lemma 4.5.2 is 


Lemma 4.5.4. F[x] is an integral domain.


Proof. If p(x) ≠ 0 and q(x) ≠ 0, then deg p(x) ≥ 0, deg q(x) ≥ 0, so
deg(p(x)q(x)) = deg p(x) + deg q(x) ≥ 0. Therefore, p(x)q(x) has a degree,
so cannot be 0 (which has no degree assigned to it). Thus F[x] is an integral
domain. □


One of the things that we were once forced to learn was to divide 
one polynomial by another. How did we do this? The process was 
called long division. We illustrate with an example how this was done, 
for what we do in the example is the model of what we shall do in the gen- 
eral case. 

We want to divide 2x^2 + 1 into x^4 - 7x + 1. We do it schematically as
follows:

                       (1/2)x^2 - 1/4
    2x^2 + 1 ) x^4                 - 7x + 1
               x^4 + (1/2)x^2
               -------------------------------
                   - (1/2)x^2      - 7x + 1
                   - (1/2)x^2           - 1/4
               -------------------------------
                                   - 7x + 5/4


and we interpret this as:

    x^4 - 7x + 1 = (2x^2 + 1)((1/2)x^2 - 1/4) + (-7x + 5/4)


and -7x + 5/4 is called the remainder in this division.

What exactly did we do? First, where did the (1/2)x^2 come from? It came
from the fact that when we multiply 2x^2 + 1 by (1/2)x^2 the leading term we
get is x^4, the highest power occurring in x^4 - 7x + 1. So subtracting
(1/2)x^2(2x^2 + 1) from x^4 - 7x + 1 gets rid of the x^4 term and we go on to
what is left and repeat the procedure.

This “repeat the procedure” suggests induction, and that is how we 
shall carry out the proof. But keep in mind that all we shall be doing is what
we did in the example above. 

What this gives us is something like Euclid’s Algorithm, in the integers. 
However, here we call it the Division Algorithm. 


Theorem 4.5.5 (Division Algorithm). Given the polynomials f(x),
g(x) ∈ F[x], where g(x) ≠ 0, then

    f(x) = q(x)g(x) + r(x),

where q(x), r(x) ∈ F[x] and r(x) = 0 or deg r(x) < deg g(x).


Proof. We go by induction on deg f(x). If either f(x) = 0 or deg f(x) <
deg g(x), then f(x) = 0·g(x) + f(x), which satisfies the conclusion of the
theorem.

So suppose that deg f(x) ≥ deg g(x); thus the polynomial f(x) =
a_0 + a_1 x + ... + a_m x^m, where a_m ≠ 0, and the polynomial g(x) =
b_0 + b_1 x + ... + b_n x^n, where b_n ≠ 0 and where m ≥ n.

Consider

    (a_m/b_n) x^{m-n} g(x) = (a_m/b_n) x^{m-n} (b_0 + b_1 x + ... + b_n x^n)
                           = (a_m b_0/b_n) x^{m-n} + ... + a_m x^m.

Thus (a_m/b_n)x^{m-n} g(x) has the same degree and same leading coefficient as
does f(x), so h(x) = f(x) - (a_m/b_n)x^{m-n} g(x) is either 0 (in which case we
are done, with r(x) = 0) or is such that the relation deg h(x) < deg f(x)
holds. Thus, by induction,

    h(x) = q_1(x)g(x) + r(x), where q_1(x), r(x) ∈ F[x]

and r(x) = 0 or deg r(x) < deg g(x). Remembering what h(x) is, we get

    h(x) = f(x) - (a_m/b_n) x^{m-n} g(x) = q_1(x)g(x) + r(x)

so

    f(x) = ((a_m/b_n) x^{m-n} + q_1(x)) g(x) + r(x).

If q(x) = (a_m/b_n)x^{m-n} + q_1(x), we have achieved the form claimed in the
theorem. □
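
The induction in this proof is really a loop: keep cancelling the leading term until what is left has degree less than deg g(x). A minimal Python sketch over the rationals follows; the coefficient-list convention [a_0, a_1, ...] and the names trim and poly_divmod are our own, and exact arithmetic is obtained from the standard fractions module.

```python
from fractions import Fraction

def trim(c):
    c = list(c)
    while c and c[-1] == 0:
        c.pop()
    return c

def poly_divmod(f, g):
    """Return (q, r) with f = q*g + r and r == [] or deg r < deg g."""
    r = trim(Fraction(a) for a in f)
    g = trim(Fraction(b) for b in g)
    if not g:
        raise ZeroDivisionError("g(x) must be nonzero")
    q = [Fraction(0)] * max(len(r) - len(g) + 1, 0)
    while len(r) >= len(g):
        shift = len(r) - len(g)      # m - n
        coef = r[-1] / g[-1]         # a_m / b_n
        q[shift] = coef
        # subtract coef * x^shift * g(x); this cancels the leading term of r
        for i, b in enumerate(g):
            r[i + shift] -= coef * b
        r = trim(r)
    return trim(q), r

# x^4 - 7x + 1 divided by 2x^2 + 1
q, r = poly_divmod([1, -7, 0, 0, 1], [1, 0, 2])
print(q)   # coefficients -1/4, 0, 1/2, that is, (1/2)x^2 - 1/4
print(r)   # coefficients 5/4, -7, that is, -7x + 5/4
```

The result agrees with the long division carried out by hand earlier in this section.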


The Division Algorithm has one immediate application: It allows us to 
determine the nature of all the ideals of F[x]. As we see in the next theorem, 
an ideal of F[x] must merely consist of all multiples, by elements of F[x], of 
some fixed polynomial. 


Theorem 4.5.6. If I ≠ (0) is an ideal of F[x], then I = {f(x)g(x) | f(x) ∈
F[x]} for some g(x) ∈ F[x]; that is, I consists of all multiples of the fixed
polynomial g(x) by the elements of F[x].


Proof. To prove the theorem, we need to produce that fixed polyno- 
mial g(x). Where are we going to dig it up? The one control we have numeri- 
cally on a given polynomial is its degree. So why not use the degree function 
as the mechanism for finding g(x)? 

Since I ≠ (0) there are elements in I having nonnegative degree. So
there is a polynomial g(x) ≠ 0 in I of minimal degree; that is, g(x) ≠ 0 is in I
and if 0 ≠ t(x) ∈ I, then deg t(x) ≥ deg g(x). Now let t(x) be any element of I.
By the division algorithm, t(x) = q(x)g(x) + r(x), where r(x) = 0 or
deg r(x) < deg g(x). But since g(x) ∈ I and I is an ideal of F[x], we have that
q(x)g(x) ∈ I. By assumption, t(x) ∈ I, thus t(x) - q(x)g(x) is in I, so
r(x) = t(x) - q(x)g(x) is in I. Since g(x) has minimal degree for the nonzero
elements of I and r(x) ∈ I, deg r(x) cannot be less than deg g(x). So we are
left with r(x) = 0. But this says that t(x) = q(x)g(x). So every element in I
is a multiple of g(x). On the other hand, since g(x) ∈ I and I is an ideal of
F[x], f(x)g(x) ∈ I for all f(x) ∈ F[x]. The net result of all this is that
I = {f(x)g(x) | f(x) ∈ F[x]}. □


Definition. An integral domain R is called a principal ideal domain if
every ideal I in R is of the form I = {xa | x ∈ R} for some a ∈ I.


Theorem 4.5.6 can be stated as: F[x] is a principal ideal domain. 

We shall write the ideal generated by a given polynomial, g(x), namely
{f(x)g(x) | f(x) ∈ F[x]}, as (g(x)).

The proof showed that if I ≠ (0) is an ideal of F[x], then I = (g(x)), where
g(x) is a nonzero polynomial of lowest degree contained in I. But g(x) is not
unique, for if a ≠ 0 ∈ F, then ag(x) is in I and has the same degree as g(x),
so I = (ag(x)).

To get some sort of uniqueness in all this, we single out a class of poly- 
nomials. 


Definition. f(x) € F[x] is a monic polynomial if the coefficient of its 
highest power is 1. 


Thus f(x) is monic means that

    f(x) = x^n + a_{n-1} x^{n-1} + ... + a_1 x + a_0.


We leave to the reader to show that if I ≠ (0) is an ideal of F[x], then there
is only one monic polynomial of lowest degree in I. Singling this out as the
generator of I does give us a “monic” uniqueness for the generation of I.

Our next step in this parallel development with what happens in the in- 
tegers is to have the notion of one polynomial dividing another. 


Definition. Suppose f(x), g(x) ∈ F[x], with g(x) ≠ 0. We say that
g(x) divides f(x), written as g(x) | f(x), if f(x) = a(x)g(x) for some a(x) ∈
F[x].


Note that if f(x) ≠ 0 and g(x) | f(x), then deg g(x) ≤ deg f(x) by Lemma
4.5.2. Moreover, the ideals (f(x)) and (g(x)) of F[x], generated by f(x) and
g(x), respectively, satisfy the containing relation (f(x)) ⊂ (g(x)). (Prove!)

We again emphasize the parallelism between Z, the integers, and F[x] 
by turning to the notion of greatest common divisor. In order to get some sort 
of uniqueness, we shall insist that the greatest common divisor always be a 
monic polynomial.


Definition. For any two polynomials f(x) and g(x) ∈ F[x] (not both 0),
the polynomial d(x) ∈ F[x] is the greatest common divisor of f(x), g(x) if
d(x) is a monic polynomial such that:

(a) d(x) | f(x) and d(x) | g(x). 
(b) If h(x) | f(x) and h(x) | g(x), then h(x) | d(x). 


Although we defined the greatest common divisor of two polynomials, 
we neither know, as yet, that it exists, nor what its form may be. We could 
define it in another, and equivalent, way as the monic polynomial of highest 
degree that divides both f(x) and g(x). If we did that, its existence would be 
automatic, but we would not know its form. 


Theorem 4.5.7. Given f(x) and g(x) ≠ 0 in F[x], then their greatest
common divisor d(x) ∈ F[x] exists; moreover, d(x) = a(x)f(x) + b(x)g(x)
for some a(x), b(x) ∈ F[x].


Proof. Let I be the set of all r(x)f(x) + s(x)g(x) as r(x), s(x) run
freely over F[x]. We claim that I is an ideal of F[x]. For,

    (r_1(x)f(x) + s_1(x)g(x)) + (r_2(x)f(x) + s_2(x)g(x))
        = (r_1(x) + r_2(x))f(x) + (s_1(x) + s_2(x))g(x),

so is again in I, and for t(x) ∈ F[x],

    t(x)(r(x)f(x) + s(x)g(x)) = (t(x)r(x))f(x) + (t(x)s(x))g(x),

so it, too, is again in I. Thus I is an ideal of F[x]. Since g(x) ≠ 0, we know
that I ≠ 0, since both f(x) and g(x) are in I.

Since I ≠ 0 is an ideal of F[x], it is generated by a unique monic
polynomial d(x) (Theorem 4.5.6). Since f(x), g(x) are in I, they must then be
multiples of d(x) by elements of F[x]. This assures us that d(x) | f(x) and
d(x) | g(x).

Because d(x) ∈ I and I is the set of all r(x)f(x) + s(x)g(x), we have
that d(x) = a(x)f(x) + b(x)g(x) for some appropriate a(x), b(x) ∈ F[x].
Thus if h(x) | f(x) and h(x) | g(x), then h(x) | (a(x)f(x) + b(x)g(x)) = d(x).
So d(x) is the greatest common divisor of f(x) and g(x).

This proves the theorem; the uniqueness of d(x) is guaranteed by the
demand that we have made that the greatest common divisor be monic. □
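
The proof above is an existence proof and does not say how to find d(x), a(x), b(x). One standard way to compute them, which is not the argument used in the proof, is to repeat the Division Algorithm as in Euclid's algorithm for integers while keeping track of the multipliers. A Python sketch over the rationals, with our own helper names and the coefficient-list convention [a_0, a_1, ...]:

```python
from fractions import Fraction

def trim(c):
    c = [Fraction(a) for a in c]
    while c and c[-1] == 0:
        c.pop()
    return c

def sub(p, q):
    n = max(len(p), len(q))
    return trim((p[i] if i < len(p) else 0) - (q[i] if i < len(q) else 0)
                for i in range(n))

def mul(p, q):
    c = [Fraction(0)] * max(len(p) + len(q) - 1, 0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            c[i + j] += a * b
    return trim(c)

def divmod_poly(f, g):
    """Division Algorithm; g must be nonzero."""
    f, g = trim(f), trim(g)
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 0)
    while len(f) >= len(g):
        shift, coef = len(f) - len(g), f[-1] / g[-1]
        q[shift] = coef
        for i, b in enumerate(g):
            f[i + shift] -= coef * b
        f = trim(f)
    return trim(q), f

def gcd_ext(f, g):
    """Return (d, a, b) with d monic and d = a*f + b*g; f, g not both zero."""
    r0, r1 = trim(f), trim(g)
    a0, a1 = [Fraction(1)], []       # invariant: r0 = a0*f + b0*g
    b0, b1 = [], [Fraction(1)]       #            r1 = a1*f + b1*g
    while r1:
        q, r = divmod_poly(r0, r1)
        r0, r1 = r1, r
        a0, a1 = a1, sub(a0, mul(q, a1))
        b0, b1 = b1, sub(b0, mul(q, b1))
    lead = r0[-1]                    # divide through to make d monic
    return ([c / lead for c in r0], [c / lead for c in a0],
            [c / lead for c in b0])

f = [-1, 0, 1]    # x^2 - 1
g = [2, -3, 1]    # x^2 - 3x + 2
d, a, b = gcd_ext(f, g)
print(d, a, b)    # d = x - 1, a = 1/3, b = -1/3
```

Indeed (1/3)(x^2 - 1) - (1/3)(x^2 - 3x + 2) = x - 1, which is monic and divides both polynomials.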


Another way to see the uniqueness of d(x) is from 


Lemma 4.5.8. If f(x) ≠ 0, g(x) ≠ 0 are in F[x] and f(x) | g(x) and
g(x) | f(x), then f(x) = ag(x), where a ∈ F.


Proof. By the mutual divisibility condition on f(x) and g(x) we have,
by Lemma 4.5.2, deg f(x) ≤ deg g(x) ≤ deg f(x), so deg f(x) = deg g(x). But
f(x) = a(x)g(x), so

    deg f(x) = deg a(x) + deg g(x) = deg a(x) + deg f(x),

in consequence of which deg a(x) = 0, so a(x) = a, an element of F. □


We leave the proof of the uniqueness of the greatest common divisor 
via Lemma 4.5.8 to the reader. 


Definition. The polynomials f(x), g(x) in F[x] are said to be relatively
prime if their greatest common divisor is 1.


Although it is merely a very special case of Theorem 4.5.7, to empha- 
size it and to have it to refer to, we state: 


Theorem 4.5.9. If f(x), g(x) ∈ F[x] are relatively prime, then
a(x)f(x) + b(x)g(x) = 1 for some a(x), b(x) ∈ F[x]. Conversely, if
a(x)f(x) + b(x)g(x) = 1 for some a(x), b(x) ∈ F[x], then f(x) and g(x) are
relatively prime.


Proof. The first statement is the special case d(x) = 1 of Theorem 4.5.7.
We leave the “conversely” part to the reader as an exercise. □

As with the integers, we have


Theorem 4.5.10. If q(x) and f(x) are relatively prime and if
q(x) | f(x)g(x), then q(x) | g(x).


Proof. By Theorem 4.5.9 a(x)f(x) + b(x)q(x) = 1 for some a(x),
b(x) ∈ F[x]. Therefore,

    a(x)f(x)g(x) + b(x)q(x)g(x) = g(x).                    (1)

Since q(x) | b(x)g(x)q(x) and q(x) | f(x)g(x) by hypothesis, q(x) divides the
left-hand side of the relation in (1). Thus q(x) divides the right-hand side
of (1), that is, q(x) | g(x), the desired conclusion. □


We are now ready to single out the important class of polynomials that 
will play the same role as prime objects in F[x] as did the prime numbers in Z. 


Definition. The polynomial p(x) € F[x] is irreducible if p(x) is of pos- 
itive degree and given any polynomial f(x) in F[x], then either p(x) | f(x) or 
p(x) is relatively prime to f(x).

We should note here that the definition implies that a polynomial p(x)
of positive degree is irreducible in F[x] if and only if p(x) cannot be factored 
as a product of two polynomials of positive degree. In other words, if p(x) = 
a(x)b(x), where a(x) and b(x) are in F[x], then either a(x) is a constant (that 
is, an element of F), or b(x) is constant. The proof of this fact is very similar 
to the proof of an analogous observation concerning two equivalent defini- 
tions of a prime number. 

First suppose that p(x) (of positive degree) cannot be factored as a 
product of two non-constant polynomials. Then, given any f(x) in F[x], we 
have only two possibilities for (p(x), f(x)), namely 1 or a monic polynomial of 
the form c·p(x), where c is an element of F. Thus (p(x), f(x)) = 1 or p(x) | f(x),
which shows that p(x) is irreducible. 

Now let p(x) be irreducible in F[x] and suppose p(x) = a(x)b(x) for 
some a(x), b(x) in F[x]. According to the definition, we must have 
p(x) | a(x) or (p(x), a(x)) = 1. If p(x) | a(x), then b(x) must be a constant. If, 
on the other hand, p(x) and a(x) are relatively prime, then by Theorem 
4.5.10, p(x) | b(x), and in this case a(x) must be a constant. This shows that an 
irreducible polynomial cannot be factored as a product of two non-constant 
polynomials. 


Note that the irreducibility of a polynomial depends on the field F. For
instance, the polynomial x^2 - 2 is irreducible in Q[x], where Q is the field of
rational numbers, but x^2 - 2 is not irreducible in R[x], where R is the field of
real numbers, for in R[x]

    x^2 - 2 = (x - √2)(x + √2).


Corollary to Theorem 4.5.10. If p(x) is irreducible in F[x] and
p(x) | a_1(x)a_2(x) ... a_k(x), where a_1(x), ..., a_k(x) are in F[x], then
p(x) | a_i(x) for some i.


Proof. We leave the proof to the reader. (See Theorem 1.5.6.) □


Aside from its other properties, an irreducible polynomial p(x) in F[x] 
enjoys the property that (p(x)), the ideal generated by p(x) in F[x], is a max- 
imal ideal of F[x]. We prove this now. 


Theorem 4.5.11. If p(x) ∈ F[x], then the ideal (p(x)) generated by
p(x) in F[x] is a maximal ideal of F[x] if and only if p(x) is irreducible in F[x].


Proof. We first prove that if p(x) is irreducible in F[x], then the ideal
M = (p(x)) is a maximal ideal of F[x]. For, suppose that N is an ideal of
F[x], and N ⊃ M. By Theorem 4.5.6,

    N = (f(x)) for some f(x) ∈ F[x].

Because p(x) ∈ M ⊂ N, p(x) = a(x)f(x), since every element in N is of this
form. But p(x) is irreducible in F[x], hence a(x) is a constant or f(x) is a
constant. If a(x) = a ∈ F, then p(x) = af(x), so f(x) = a^{-1}p(x). Thus
f(x) ∈ M, which says that N ⊂ M, hence N = M. On the other hand, if
f(x) = b ∈ F, then 1 = b^{-1}b ∈ N, since N is an ideal of F[x], thus
g(x)·1 ∈ N for all g(x) ∈ F[x]. This says that N = F[x]. Therefore, we have
shown M to be a maximal ideal of F[x].

In the other direction, suppose that M = (p(x)) is a maximal ideal of
F[x]. If p(x) is not irreducible, then p(x) = a(x)b(x), where deg a(x) ≥ 1,
deg b(x) ≥ 1. Let N = (a(x)); then, since p(x) = a(x)b(x), p(x) ∈ N.
Therefore, M ⊂ N. Since deg a(x) ≥ 1, N = (a(x)) ≠ F[x], since every nonzero
element in (a(x)) has degree at least that of a(x). By the maximality of M we
conclude that M = N. But then a(x) ∈ N = M, which tells us that
a(x) = f(x)p(x); combined with p(x) = a(x)b(x) = b(x)f(x)p(x), we get that
b(x)f(x) = 1. Since deg 1 = 0 < deg b(x) ≤ deg(b(x)f(x)) = deg 1 = 0, we have
reached a contradiction. Thus p(x) is irreducible. □


This theorem is important because it tells us exactly what the maximal 
ideals of F[x] are, namely the ideals generated by the irreducible polynomi- 
als. If M is a maximal ideal of F[x], F[x]/M is a field, and this field contains 
F (or more precisely, the field {a + M|a € F}, which is isomorphic to F). 
This allows us to construct decent fields K ⊃ F, the decency of which lies in
that p(x) has a root in K. The exact statement and explanation of this we 
postpone until Chapter 5. 
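
A tiny instance can be checked by brute force. As an assumption for illustration only, take F = Z_3 and p(x) = x^2 + 1, which has no root in Z_3 and so is irreducible there. Writing u for the image of x, every element of F[x]/(p(x)) is a + bu with u^2 = -1; the Python sketch below stores a + bu as the pair (a, b), a representation of our own choosing.

```python
from itertools import product

P = 3   # assumption: F = Z_3 and p(x) = x^2 + 1

def mul(s, t):
    """(a + bu)(c + du) with u^2 = -1 and coefficients taken mod P."""
    a, b = s
    c, d = t
    return ((a * c - b * d) % P, (a * d + b * c) % P)

elements = list(product(range(P), repeat=2))   # all pairs (a, b)
one = (1, 0)

# search for a multiplicative inverse of every nonzero element
inverses = {s: next(t for t in elements if mul(s, t) == one)
            for s in elements if s != (0, 0)}

u = (0, 1)
u_squared_plus_one = tuple((x + y) % P for x, y in zip(mul(u, u), one))

print(len(elements))         # 9
print(inverses[(1, 1)])      # (2, 1), since (1 + u)(2 + u) = 1
print(u_squared_plus_one)    # (0, 0): u is a root of x^2 + 1
```

So this quotient has 9 elements, every nonzero element is invertible, and the image u of x is a root of p(x) in it.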

The last topic in this direction that we want to discuss is the
factorization of a given polynomial as a product of irreducible ones. Note
that if p(x) = a_0 x^n + a_1 x^{n-1} + ... + a_{n-1} x + a_n, a_0 ≠ 0, is
irreducible in F[x], then so is a_0^{-1}p(x) irreducible in F[x]; however,
a_0^{-1}p(x) has the advantage of being monic. So we have this monic
irreducible polynomial trivially obtainable from p(x) itself. This will allow
us to make more precise the uniqueness part of the next theorem.


Theorem 4.5.12. Let f(x) ∈ F[x] be of positive degree. Then either
f(x) is irreducible in F[x] or f(x) is the product of irreducible polynomials in
F[x]. In fact, then,

    f(x) = a p_1(x)^{m_1} p_2(x)^{m_2} ... p_k(x)^{m_k},

where a is the leading coefficient of f(x), p_1(x), ..., p_k(x) are monic and
irreducible in F[x], m_1 > 0, ..., m_k > 0 and this factorization in this form
is unique up to the order of the p_i(x).


Proof. We first show the first half of the theorem, namely that f(x) is
irreducible or the product of irreducibles. The proof is exactly the same as
that of Theorem 1.5.7, with a slight, obvious adjustment.


We go by induction on deg f(x). If deg f(x) = 1, then f(x) = ax + b 
with a # 0 and is clearly irreducible in F[x]. So the result is true in this case. 

Suppose, then, that the theorem is correct for all a(x) € F[x] such that 
deg a(x) < deg f(x). If f(x) is irreducible, then we have nothing to prove. 
Otherwise, f(x) = a(x)b(x), a(x) and b(x) € F[x] and deg a(x) < deg f(x) 
and deg b(x) < deg f(x). By the induction, a(x) [and b(x)] is irreducible or is 
the product of irreducibles. But then f(x) is the product of irreducible poly- 
nomials in F[x]. This completes the induction, and so proves the opening half 
of the theorem. 

Now to the uniqueness half. Again we go by induction on deg f(x). If 
deg f(x) = 1, then f(x) is irreducible and the uniqueness is clear. 

Suppose the result true for polynomials of degree less than deg f(x).
Suppose that

    f(x) = a p_1(x)^{m_1} p_2(x)^{m_2} ... p_k(x)^{m_k}
         = a q_1(x)^{n_1} ... q_r(x)^{n_r},

where the p_i(x), q_i(x) are monic irreducible polynomials and the m_i, n_i are
all positive and a is the leading coefficient of f(x), that is, the coefficient
of the highest power term of f(x). Since p_1(x) | f(x), we have that
p_1(x) | q_1(x)^{n_1} ... q_r(x)^{n_r}, so by the corollary to Theorem 4.5.10,
p_1(x) | q_i(x) for some i. Since q_i(x) is monic and irreducible, as is
p_1(x), we get p_1(x) = q_i(x). We can suppose (on renumbering) that
p_1(x) = q_1(x). Thus

    f(x)/p_1(x) = a p_1(x)^{m_1 - 1} p_2(x)^{m_2} ... p_k(x)^{m_k}
                = a p_1(x)^{n_1 - 1} q_2(x)^{n_2} ... q_r(x)^{n_r}.

By induction we have unique factorization in the required form for
f(x)/p_1(x), whose degree is less than deg f(x). Hence we obtain that
m_1 - 1 = n_1 - 1 (so m_1 = n_1), m_2 = n_2, ..., m_k = n_k, r = k and
p_2(x) = q_2(x), ..., p_k(x) = q_k(x), on renumbering the q's appropriately.
This completes the induction and proves the theorem. □
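
Over a small finite field the factorization of Theorem 4.5.12 can be found by sheer trial division. The Python sketch below is a naive illustration and not an efficient algorithm; it assumes F = Z_p for a prime p, stores polynomials as coefficient lists [a_0, a_1, ...], uses helper names of our own, and relies on pow(a, -1, p), available in Python 3.8 and later, for inverses mod p.

```python
from itertools import product

def trim(c, p):
    c = [a % p for a in c]
    while c and c[-1] == 0:
        c.pop()
    return c

def divmod_mod(f, g, p):
    """Division Algorithm in Z_p[x]; g must be nonzero."""
    f, g = trim(f, p), trim(g, p)
    inv = pow(g[-1], -1, p)          # inverse of the leading coefficient of g
    q = [0] * max(len(f) - len(g) + 1, 0)
    while len(f) >= len(g):
        shift, coef = len(f) - len(g), f[-1] * inv % p
        q[shift] = coef
        for i, b in enumerate(g):
            f[i + shift] = (f[i + shift] - coef * b) % p
        f = trim(f, p)
    return trim(q, p), f

def factor(f, p):
    """Return (a, factors): a is the leading coefficient of f and factors is
    the list of monic irreducible factors, repeated according to multiplicity.
    Assumes deg f >= 1."""
    f = trim(f, p)
    a = f[-1]
    f = [c * pow(a, -1, p) % p for c in f]     # make f monic
    factors, d = [], 1
    while len(f) - 1 >= 2 * d:       # a proper factor has degree <= deg f / 2
        for tail in product(range(p), repeat=d):
            g = list(tail) + [1]     # monic candidate of degree d
            q, r = divmod_mod(f, g, p)
            if not r:                # g divides f; no smaller factor exists,
                factors.append(g)    # so g is irreducible
                f = q
                break
        else:
            d += 1                   # no factor of degree d: try degree d + 1
    if len(f) > 1:
        factors.append(f)            # what remains is irreducible
    return a, factors

print(factor([2, 0, 0, 2], 3))     # (2, [[1, 1], [1, 1], [1, 1]])
print(factor([1, 0, 1, 0, 1], 2))  # (1, [[1, 1, 1], [1, 1, 1]])
```

The first line says 2x^3 + 2 = 2(x + 1)^3 in Z_3[x]; the second says x^4 + x^2 + 1 = (x^2 + x + 1)^2 in Z_2[x].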


We have pointed out how similar the situation is for the integers Z and
the polynomial ring F[x]. This suggests that there should be a wider class of
rings, of which the two examples Z and F[x] are special cases, for which
much of the argumentation works. It worked for Z and F[x] because we had
a measure of size in them, either by the size of an integer or the degree of a
polynomial. This measure of size was such that it allowed a Euclid-type
algorithm to hold.

This leads us to define a class of rings, the Euclidean rings.


Definition. An integral domain R is a Euclidean ring if there is a 
function d from the nonzero elements of R to the nonnegative integers that 
satisfies: 


(a) For a ≠ 0, b ≠ 0 ∈ R, d(a) ≤ d(ab).
(b) Given a ≠ 0, b ≠ 0, there exist q and r ∈ R such that b = qa + r, where
r = 0 or d(r) < d(a).


The interested student should try to see which of the results proved for 
polynomial rings (and the integers) hold in a general Euclidean ring. Aside 
from a few problems involving Euclidean rings, we shall not go any further 
with this interesting class of rings. 
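
To make the two conditions concrete, here is a small Python check for the ring Z with d(a) = |a|. The names d, euclid_divide and gcd are our own, and the check over a small range is a sanity test, not a proof.

```python
def d(a):
    """The size function on the nonzero integers."""
    return abs(a)

def euclid_divide(b, a):
    """Return (q, r) with b == q*a + r and r == 0 or d(r) < d(a); a nonzero."""
    return divmod(b, a)

def gcd(a, b):
    """Repeat condition (b) until the remainder is 0."""
    while b != 0:
        a, b = b, euclid_divide(a, b)[1]
    return d(a)

# check conditions (a) and (b) on a small range of nonzero integers
for a in range(-6, 7):
    for b in range(-6, 7):
        if a != 0 and b != 0:
            assert d(a) <= d(a * b)
            q, r = euclid_divide(b, a)
            assert b == q * a + r and (r == 0 or d(r) < d(a))

print(gcd(12, 18))    # 6
print(gcd(-12, 18))   # 6
```

Python's built-in divmod returns a remainder with the sign of the divisor and absolute value less than |a|, which is all that condition (b) asks for.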

The final comment we make here is that what we did for polynomials 
over a field we could try to do for polynomials over an arbitrary ring. That is, 
given any ring R (commutative or noncommutative), we could define the 
polynomial ring R[x] in x over R by defining equality, addition, and multipli- 
cation exactly as we did in F[x], for F a field. The ring so constructed, R[x], is 
a very interesting ring, whose structure is tightly interwoven with that of R it- 
self. It would be too much to expect that all, or even any, of the theorems 
proved in this section would carry over to R[x] for a general ring R. 


PROBLEMS 
In the following problems, F will always denote a field. 
Easier Problems 


1. If F is a field, show that the only invertible elements in F[x] are the 
nonzero elements of F. 

2. If R is a ring, we introduce the ring R[x] of polynomials in x over R, just as
we did F[x]. Defining deg f(x) for f(x) ∈ R[x] as we did in F[x], show that:
(a) deg(f(x)g(x)) ≤ deg f(x) + deg g(x) if f(x)g(x) ≠ 0.

(b) There is a commutative ring R such that we can find f(x), g(x) in
R[x] with deg(f(x)g(x)) < deg f(x) + deg g(x).

3. Find the greatest common divisor of the following polynomials over Q,
the field of rational numbers.
(a) x^3 - 6x + 7 and x + 4.
(b) x^2 - 1 and 2x^7 - 4x^5 + 2.
(c) 3x^2 + 1 and x^6 + x^4 + x + 1.
(d) x^3 - 1 and x^7 - x^4 + x^3 - 1.
4. Prove Lemma 4.5.3. 


5. In Problem 3, let I = {f(x)a(x) + g(x)b(x)}, where f(x), g(x) run over
Q[x] and a(x) is the first polynomial and b(x) the second one in each part
of the problem. Find d(x), so that I = (d(x)) for Parts (a), (b), (c),
and (d).


6. If g(x), f(x) ∈ F[x] and g(x) | f(x), show that (f(x)) ⊂ (g(x)).

7. Prove the uniqueness of the greatest common divisor of two polynomials 
in F[x] by using Lemma 4.5.8. 

8. If f(x), g(x) ∈ F[x] are relatively prime and f(x) | h(x) and g(x) | h(x),
show that f(x)g(x) | h(x).

9. Prove the Corollary to Theorem 4.5.10. 


10. Show that the following polynomials are irreducible over the field F
indicated.

(a) x^2 + 7 over F = real field = R.

(b) x^3 - 3x + 3 over F = rational field = Q.

(c) x^2 + x + 1 over F = Z_2.

(d) x^2 + 1 over F = Z_19.

(e) x^3 - 9 over F = Z_13.

(f) x^4 + 2x^2 + 2 over F = Q.

11. If p(x) ∈ F[x] is of degree 3 and p(x) = a_0 x^3 + a_1 x^2 + a_2 x + a_3, show
that p(x) is irreducible over F if there is no element r ∈ F such that
p(r) = a_0 r^3 + a_1 r^2 + a_2 r + a_3 = 0.

12. If F ⊂ K are two fields and f(x), g(x) ∈ F[x] are relatively prime in F[x],
show that they are relatively prime in K[x].


Middle-Level Problems 


13. Let R be the field of real numbers and C that of complex numbers. Show
that R[x]/(x^2 + 1) ≅ C. [Hint: If A = R[x]/(x^2 + 1), let u be the image of
x in A; show that every element in A is of the form a + bu, where
a, b ∈ R and u^2 = -1.]

14. Let F = Z_11, the integers mod 11.
(a) Let p(x) = x^2 + 1; show that p(x) is irreducible in F[x] and that
F[x]/(p(x)) is a field having 121 elements.
(b) Let p(x) = x^3 + x + 4 ∈ F[x]; show that p(x) is irreducible in F[x]
and that F[x]/(p(x)) is a field having 11^3 elements.

15. Let F = Z_p be the field of integers mod p, where p is a prime, and let
q(x) ∈ F[x] be irreducible of degree n. Show that F[x]/(q(x)) is a field
having at most p^n elements. (See Problem 16 for a more exact statement.)

16. Let F, q(x) be as in Problem 15; show that F[x]/(q(x)) has exactly p^n
elements.

17. Let p_1(x), p_2(x), ..., p_k(x) ∈ F[x] be distinct irreducible polynomials
and let q(x) = p_1(x)p_2(x) ... p_k(x). Show that

    F[x]/(q(x)) ≅ F[x]/(p_1(x)) ⊕ F[x]/(p_2(x)) ⊕ ... ⊕ F[x]/(p_k(x)).

18. Let F be a finite field. Show that F[x] contains irreducible polynomials of
arbitrarily high degree. (Hint: Try to imitate Euclid’s proof that there is
an infinity of prime numbers.)

19. Construct a field having p^2 elements, for p an odd prime.

20. If R is a Euclidean ring, show that every ideal of R is principal.

21. If R is a Euclidean ring, show that R has a unit element.

22. If R is the ring of even integers, show that Euclid’s algorithm is false in R
by exhibiting two even integers for which the algorithm does not hold.


Harder Problems 


23. Let F = Z_7 and let p(x) = x^3 - 2 and q(x) = x^3 + 2 be in F[x]. Show
that p(x) and q(x) are irreducible in F[x] and that the fields F[x]/(p(x))
and F[x]/(q(x)) are isomorphic.

24. Let Q be the field of rational numbers, and let q(x) = x^2 + x + 1 be in
Q[x]. If α is a complex number such that α^2 + α + 1 = 0, show that the
set {a + bα | a, b ∈ Q} is a field in two ways; the first by showing it to be
isomorphic to something you know is a field, the second by showing that
if a + bα ≠ 0, then its inverse is of the same form.

25. If p is a prime, show that q(x) = 1 + x + x^2 + ... + x^{p-1} is irreducible in
Q[x].

26. Let R be a commutative ring in which a^2 = 0 only if a = 0. Show that if
q(x) ∈ R[x] is a zero-divisor in R[x], then, if

    q(x) = a_0 x^n + a_1 x^{n-1} + ... + a_n,

there is an element b ≠ 0 in R such that ba_0 = ba_1 = ... = ba_n = 0.

27. Let R be a ring and I an ideal of R. If R[x] and I[x] are the polynomial
rings in x over R and I, respectively, show that:
(a) I[x] is an ideal of R[x].
(b) R[x]/I[x] ≅ (R/I)[x].


Very Hard Problems 


*28. Do Problem 26 even if the condition “a^2 = 0 only if a = 0” does not hold
in R.

29. Let R = {a + bi | a, b integers} ⊂ C. Let d(a + bi) = a^2 + b^2. Show that
R is a Euclidean ring where d is its required Euclidean function. (R is
known as the ring of Gaussian integers and plays an important role in
number theory.)


6. POLYNOMIALS OVER THE RATIONALS 


In our consideration of the polynomial ring F[x] over a field F, the particular 
nature of F never entered the picture. All the results hold for arbitrary fields. 
However, there are results that exploit the explicit character of certain fields. 
One such field is that of the rational numbers. 

We shall present two important theorems for Q[x], the polynomial ring
over the rational field Q. These results depend heavily on the fact that we are
dealing with rational numbers. The first of these, Gauss’ Lemma, relates the
factorization over the rationals with factorization over the integers. The
second one, known as the Eisenstein Criterion, gives us a method of constructing
irreducible polynomials of arbitrary degree, at will, in Q[x]. In this the field Q
is highly particular. For instance, there is no easy algorithm for obtaining
irreducible polynomials of arbitrary degree n over the field Z_p of the integers
mod p, p a prime. Even over Z_2 such an algorithm is nonexistent; it would be
highly useful to have, especially for coding theory. But it just doesn’t exist,
so far.

We begin our consideration with two easy results. 


Lemma 4.6.1. Let f(x) ∈ Q[x]; then

    f(x) = (u/m)(a_0 x^n + a_1 x^{n-1} + ... + a_n)

where u, m, a_0, ..., a_n are integers and the a_0, a_1, ..., a_n have no common
factor greater than 1 (i.e., are relatively prime) and (u, m) = 1.


Proof. Since f(x) ∈ Q[x], f(x) = q_0 x^n + q_1 x^{n-1} + ... + q_n, where the
q_i are rational numbers. So for i = 0, 1, 2, ..., n, q_i = b_i/c_i, where
b_i, c_i are integers. Thus

    f(x) = (b_0/c_0) x^n + (b_1/c_1) x^{n-1} + ... + b_n/c_n;

clearing of denominators gives us

    f(x) = (1/(c_0 c_1 ... c_n)) (u_0 x^n + u_1 x^{n-1} + ... + u_n),

where the u_i are integers. If w is the greatest common divisor of
u_0, u_1, ..., u_n, then each u_i = w a_i, where a_0, a_1, ..., a_n are
relatively prime integers. Then

    f(x) = (w/(c_0 c_1 ... c_n)) (a_0 x^n + a_1 x^{n-1} + ... + a_n);

canceling out the greatest common factor of w and c_0 c_1 ... c_n gives us

    f(x) = (u/m)(a_0 x^n + ... + a_n),

where u, m are relatively prime integers, as is claimed in the lemma. □
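
The steps of this proof translate directly into a short computation. The Python sketch below follows them one by one for a nonzero f(x); the function name primitive_form is our own, coefficients are listed from a_0 x^n down to a_n as in the lemma, and math.prod requires Python 3.8 or later.

```python
from fractions import Fraction
from functools import reduce
from math import gcd, prod

def primitive_form(coeffs):
    """Write a nonzero f in Q[x] as (u/m) * (a_0 x^n + ... + a_n), where the
    a_i are integers with no common factor and (u, m) = 1.
    Returns (u, m, [a_0, ..., a_n])."""
    qs = [Fraction(c) for c in coeffs]
    den = prod(q.denominator for q in qs)    # c_0 c_1 ... c_n
    us = [int(q * den) for q in qs]          # the integers u_i
    w = reduce(gcd, us, 0)                   # gcd of the u_i
    a = [u // w for u in us]                 # relatively prime integers a_i
    ratio = Fraction(w, den)                 # Fraction reduces to lowest terms
    return ratio.numerator, ratio.denominator, a

# f(x) = (2/3)x^2 + (4/5)x + 2
print(primitive_form([Fraction(2, 3), Fraction(4, 5), 2]))   # (2, 15, [5, 6, 15])
```

Here f(x) = (2/15)(5x^2 + 6x + 15), and 5, 6, 15 have no common factor greater than 1.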


The next result is a result about a particular homomorphic image of 
R[x] for any ring R. 


Lemma 4.6.2. If R is any ring and I an ideal of R, then I[x], the
polynomial ring in x over I, is an ideal of R[x]. Furthermore,
R[x]/I[x] ≅ (R/I)[x], the polynomial ring in x over R/I.


Proof. Let R̄ = R/I; then there is a homomorphism φ: R → R̄, defined
by φ(a) = a + I, whose kernel is I. Define Φ: R[x] → R̄[x] by: If

    f(x) = a_0 x^n + a_1 x^{n-1} + ... + a_n,

then

    Φ(f(x)) = φ(a_0)x^n + φ(a_1)x^{n-1} + ... + φ(a_n).

We leave it to the reader to prove that Φ is a homomorphism of R[x] onto
R̄[x]. What is the kernel, Ker Φ, of Φ? If f(x) = a_0 x^n + ... + a_n is in
Ker Φ, then Φ(f(x)) = 0, the 0 element of R̄[x]. Since

    Φ(f(x)) = φ(a_0)x^n + φ(a_1)x^{n-1} + ... + φ(a_n) = 0,

we conclude φ(a_0) = 0, φ(a_1) = 0, ..., φ(a_n) = 0, by the very definition of
what we mean by the 0-polynomial in a polynomial ring. Thus each a_i is in
the kernel of φ, which happens to be I. Because a_0, a_1, ..., a_n are in I,
f(x) = a_0 x^n + a_1 x^{n-1} + ... + a_n is in I[x]. So Ker Φ ⊂ I[x]. That
I[x] ⊂ Ker Φ is immediate from the definition of the mapping Φ. Hence
I[x] = Ker Φ. By the First Homomorphism Theorem (Theorem 4.3.3), the ring
I[x] is then an ideal of R[x] and R̄[x] ≅ R[x]/Ker Φ = R[x]/I[x]. This proves
the lemma, remembering that R̄ = R/I. □


As a very special case of the lemma we have the 


Corollary. Let Z be the ring of integers, p a prime number in Z, and
I = (p), the ideal of Z generated by p. Then Z[x]/I[x] ≅ $Z_p[x]$.


Proof. Since $Z_p$ ≅ Z/I, the corollary follows by applying the lemma to
R = Z. □
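For R = Z and I = (p), the map from Z[x] to $Z_p[x]$ used in the lemma simply reduces every coefficient mod p. The sketch below is an illustration and not part of the original text: the helper names `poly_mul` and `reduce_mod` are hypothetical, and polynomials are stored as coefficient lists with the constant term first. It checks on one example that the map respects products and that a polynomial with all coefficients in (p) lands on 0.

```python
def poly_mul(f, g):
    """Multiply two integer polynomials (coefficient lists, constant term first)."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


def reduce_mod(f, p):
    """Image of f under the coefficientwise reduction Z[x] -> Z_p[x]."""
    return [c % p for c in f]


p = 5
f = [3, 7, 1]        # 3 + 7x + x^2
g = [-4, 10, 2]      # -4 + 10x + 2x^2

# Reducing the product agrees with multiplying the reductions (and reducing).
lhs = reduce_mod(poly_mul(f, g), p)
rhs = reduce_mod(poly_mul(reduce_mod(f, p), reduce_mod(g, p)), p)

# A polynomial whose coefficients all lie in (p) is in the kernel.
in_kernel = reduce_mod([10, -5, 15], p)
print(lhs == rhs, in_kernel)
```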


We are ready to prove the first of the two major results we seek in this 
section. 


Theorem 4.6.3 (Gauss’ Lemma). If f(x) ∈ Z[x] is a monic polynomial
and f(x) = a(x)b(x), where a(x) and b(x) are in Q[x], then f(x) =
$a_1(x)b_1(x)$, where $a_1(x)$, $b_1(x)$ are monic polynomials in Z[x] and $\deg a_1(x)$ =
deg a(x), $\deg b_1(x)$ = deg b(x).


Proof. Suppose that f(x) ∈ Z[x] and that f(x) = a(x)b(x), where a(x),
b(x) ∈ Q[x], and deg a(x) = s, deg b(x) = r. By Lemma 4.6.1, we can ex-
press each of a(x), b(x) as a product of a rational number and a polynomial
with integer coefficients. More precisely,

$$a(x) = \frac{u_1}{m_1}\,(a_0' x^s + a_1' x^{s-1} + \cdots + a_s') = \frac{u_1}{m_1}\,a_1(x),$$

where $a_0', a_1', \ldots, a_s'$ are relatively prime integers and

$$b(x) = \frac{u_2}{m_2}\,(b_0' x^r + b_1' x^{r-1} + \cdots + b_r') = \frac{u_2}{m_2}\,b_1(x),$$

where $b_0', b_1', \ldots, b_r'$ are relatively prime. Thus

$$f(x) = a(x)b(x) = \frac{u_1 u_2}{m_1 m_2}\,a_1(x)b_1(x) = \frac{v}{w}\,a_1(x)b_1(x),$$

where v and w are relatively prime, by canceling out the common factor of
$u_1 u_2$ and $m_1 m_2$. Therefore, $w f(x) = v a_1(x) b_1(x)$, and f(x), $a_1(x)$, $b_1(x)$ are
all in Z[x]. Of course, we may assume with no loss of generality that the lead-
ing coefficients of $a_1(x)$ and $b_1(x)$ are positive.

If w = 1, then, since f(x) is monic, we get that $v a_0' b_0' = 1$ and this leads
easily to v = 1, $a_0' = b_0' = 1$ and so $f(x) = a_1(x)b_1(x)$, where both $a_1(x)$ and
$b_1(x)$ are monic polynomials with integer coefficients. This is precisely the
claim of the theorem, since $\deg a_1(x)$ = deg a(x) and $\deg b_1(x)$ = deg b(x).

Suppose then that w ≠ 1; thus there is a prime p such that p | w and,
since (v, w) = 1, p ∤ v. Also, since the coefficients $a_0', a_1', \ldots, a_s'$ of $a_1(x)$
are relatively prime, there is an i such that $p \nmid a_i'$; similarly, there is a j
such that $p \nmid b_j'$. Let I = (p) be the ideal generated by p in Z; then
Z/I ≅ $Z_p$ and, by the Corollary to Lemma 4.6.2, Z[x]/I[x] ≅ $Z_p[x]$, so is an in-
tegral domain. However, since p | w, $\bar{w}$, the image of w in Z[x]/I[x], is 0,
and since p ∤ v, $\bar{v}$, the image of v in Z[x]/I[x], is not 0. Thus $0\bar{f}(x)$ =
$\bar{v}\,\bar{a}_1(x)\bar{b}_1(x)$, where $\bar{v} \neq 0$ and $\bar{a}_1(x) \neq 0$, $\bar{b}_1(x) \neq 0$ because $p \nmid a_i'$ and $p \nmid b_j'$ for
the given i, j above. This contradicts that Z[x]/I[x] is an integral domain. So
w ≠ 1 is not possible, and the theorem is proved. □
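A small numerical illustration of the theorem, offered only as a sketch and not as part of the original text. The polynomial $x^2 - 1$ and its rational factorization $(2x + 2)(x/2 - 1/2)$ are our own choice of example, and the helper names are hypothetical. Coefficient lists have the constant term first. Dividing each rational factor by its leading coefficient gives monic factors, and the theorem predicts that these have integer coefficients.

```python
from fractions import Fraction


def poly_mul(f, g):
    """Multiply two polynomials with Fraction coefficients (constant term first)."""
    out = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


def make_monic(f):
    """Divide f by its leading coefficient (the last entry of the list)."""
    lead = f[-1]
    return [c / lead for c in f]


f = [Fraction(-1), Fraction(0), Fraction(1)]    # x^2 - 1, monic in Z[x]
a = [Fraction(2), Fraction(2)]                  # 2x + 2
b = [Fraction(-1, 2), Fraction(1, 2)]           # x/2 - 1/2

a1, b1 = make_monic(a), make_monic(b)           # x + 1 and x - 1
integral = all(c.denominator == 1 for c in a1 + b1)
print(a1, b1, integral, poly_mul(a1, b1) == f)
```

Because the leading coefficients of a(x) and b(x) multiply to 1 when f(x) is monic, rescaling both factors to be monic does not change the product.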


It might be instructive for the reader to try to show directly that if 
x? + 6x — 7 is the product of two polynomials having rational coefficients, 
then it is already the product of two monic polynomials with integer coeffi- 
cients. 


One should say something about C. F. Gauss (1777-1855), considered by many 
to be the greatest mathematician ever. His contributions in number theory, al- 
gebra, geometry, and so on, are of gigantic proportions. His contributions in 
physics and astronomy are also of such a great proportion that he is considered 
by physicists as one of their greats, and by the astronomers as one of the impor- 
tant astronomers of the past. 


As we indicated at the beginning of this section, irreducible polynomi- 
als of degree n over a given field F may be very hard to come by. However, 
over the rationals, due to the next theorem, these exist in abundance and are 
very easy to construct. 


Theorem 4.6.4 (The Eisenstein Criterion). Let $f(x) = x^n + a_1 x^{n-1}$
$+ \cdots + a_n$ be a nonconstant polynomial with integer coefficients. Suppose
that there is some prime p such that $p \mid a_1$, $p \mid a_2$, ..., $p \mid a_n$, but $p^2 \nmid a_n$. Then
f(x) is irreducible in Q[x].


Proof. Suppose that f(x) = u(x)v(x), where u(x), v(x) are of positive
degree and are polynomials in Q[x]. By Gauss’ Lemma we may assume that
both u(x) and v(x) are monic polynomials with integer coefficients. Let
I = (p) be the ideal generated by p in Z, and consider Z[x]/I[x], which is an
integral domain, since we know by the Corollary to Lemma 4.6.2 that
Z[x]/I[x] ≅ (Z/I)[x] ≅ $Z_p[x]$. The image of $f(x) = x^n + a_1 x^{n-1} + \cdots + a_n$ in
Z[x]/I[x] is $x^n$, since $p \mid a_1$, ..., $p \mid a_n$. So if $\bar{u}(x)$ is the image of u(x) and $\bar{v}(x)$
that of v(x) in Z[x]/I[x], then $x^n = \bar{u}(x)\bar{v}(x)$. Since $\bar{u}(x) \mid x^n$, $\bar{v}(x) \mid x^n$ in Z[x]/I[x],
we must have that $\bar{u}(x) = x^r$, $\bar{v}(x) = x^{n-r}$ for some $1 \le r < n$. But then
$u(x) = x^r + pg(x)$ and $v(x) = x^{n-r} + ph(x)$, where g(x) and h(x) are polyno-
mials with integer coefficients. Since $u(x)v(x) = x^n + px^r h(x) + px^{n-r} g(x) + p^2 g(x)h(x)$,
and since $1 \le r < n$, the constant term of u(x)v(x) is $p^2 st$, where
s is the constant term of g(x) and t the constant term of h(x). Because f(x) =
u(x)v(x), their constant terms are equal, hence $a_n = p^2 st$. Since s and t are in-
tegers, we get that $p^2 \mid a_n$, a contradiction. In this way we see that f(x) is irre-
ducible. □


We give some examples of the use to which the Eisenstein Criterion 
can be put. 


1. Let $f(x) = x^n - p$, p any prime. Then one sees at a glance that f(x) is
irreducible in Q[x], for the Eisenstein Criterion applies.

2. Let $f(x) = x^5 - 4x + 22$. Since 2 | 22, $2^2 \nmid 22$ and 2 divides the other rel-
evant coefficients of f(x), the Eisenstein Criterion tells us that f(x) is ir-
reducible in Q[x].

3. Let $f(x) = x^{11} - 6x^4 + 12x^3 + 36x - 6$. We see that f(x) is irreducible
in Q[x] by using either 2 or 3 to check the conditions of the Eisenstein
Criterion.

4. Let $f(x) = 5x^4 - 7x + 7$; f(x) is not monic, but we can modify f(x)
slightly to be in a position where we can apply the Eisenstein Criterion.
Let

$$g(x) = 5^3 f(x) = 5^4 x^4 - 7\cdot 5^3 x + 7\cdot 5^3 = (5x)^4 - 175(5x) + 875.$$

If we let y = 5x, then $g(x) = h(y) = y^4 - 175y + 875$. The polynomial
h(y) is irreducible in Z[y] by using the prime 7 and applying the Eisen-
stein Criterion. The irreducibility of h(y) implies that of g(x), and so
that of f(x), in Q[x].

This suggests a slight generalization of the Eisenstein Criterion to
nonmonic polynomials. (See Problem 4.)

5. Let $f(x) = x^4 + x^3 + x^2 + x + 1$; as it stands we cannot, of course,
apply the Eisenstein Criterion to f(x). We pass to a polynomial g(x)
closely related to f(x) whose irreducibility in Q[x] will ensure that of
f(x). Let $g(x) = f(x + 1) = (x + 1)^4 + (x + 1)^3 + (x + 1)^2 + (x + 1) + 1$
$= x^4 + 5x^3 + 10x^2 + 10x + 5$. The Eisenstein Criterion applies to
g(x), using the prime 5; thus g(x) is irreducible in Q[x]. This implies
that f(x) is irreducible in Q[x]. (See Problem 1.)


Gotthold Eisenstein (1823-1852) in his short life made fundamental contribu-
tions in algebra and analysis.
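The divisibility checks in these examples are mechanical, so they are easy to sketch in code. The following is an illustration only and not part of the original text: the function names are hypothetical, coefficients are listed from the leading term down, and the caller is assumed to supply a prime p (primality is not checked). A result of False says only that the criterion does not apply with that prime, not that the polynomial is reducible.

```python
from math import comb


def eisenstein_applies(coeffs, p):
    """Check the hypotheses of Theorem 4.6.4 for x^n + a_1 x^(n-1) + ... + a_n.

    coeffs = [1, a_1, ..., a_n]; p is assumed to be prime (not verified here).
    """
    if len(coeffs) < 2 or coeffs[0] != 1:
        raise ValueError("expected a nonconstant monic polynomial")
    divides_all = all(a % p == 0 for a in coeffs[1:])
    return divides_all and coeffs[-1] % (p * p) != 0


def shift_by_one(coeffs):
    """Coefficients of f(x + 1), leading term first, via the binomial theorem."""
    n = len(coeffs) - 1
    out = [0] * (n + 1)
    for k, a in enumerate(coeffs):       # a is the coefficient of x^(n - k)
        d = n - k
        for j in range(d + 1):           # (x + 1)^d = sum over j of C(d, j) x^j
            out[n - j] += a * comb(d, j)
    return out


f = [1, 1, 1, 1, 1]                      # x^4 + x^3 + x^2 + x + 1 (Example 5)
g = shift_by_one(f)                      # f(x + 1)
print(g, eisenstein_applies(g, 5))
print(eisenstein_applies([1, 0, 0, -175, 875], 7))   # h(y) from Example 4
```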


PROBLEMS 


1. In Example 5, show that because g(x) is irreducible in Q[x], then so is
f(x).

2. Prove that $f(x) = x^3 + 3x + 2$ is irreducible in Q[x].


3. Show that there is an infinite number of integers a such that f(x) =
$x^7 + 15x^2 - 30x + a$ is irreducible in Q[x]. What a’s do you suggest?


4. Prove the following generalization of the Eisenstein Criterion. Let
$f(x) = a_0 x^n + a_1 x^{n-1} + \cdots + a_n$ have integer coefficients and suppose
that there is a prime p such that

$$p \nmid a_0,\quad p \mid a_1,\quad p \mid a_2,\ \ldots,\ p \mid a_{n-1},\quad p \mid a_n,$$

but $p^2 \nmid a_n$; then f(x) is irreducible in Q[x].

5. If p is a prime, show that $f(x) = x^{p-1} + x^{p-2} + \cdots + x + 1$ is irreducible
in Q[x].

6. Let F be a field and φ an automorphism of F[x] such that φ(a) = a for
every a ∈ F. If f(x) ∈ F[x], prove that f(x) is irreducible in F[x] if and
only if g(x) = φ(f(x)) is.

7. Let F be a field. Define the mapping

φ: F[x] → F[x] by φ(f(x)) = f(x + 1)

for every f(x) ∈ F[x]. Prove that φ is an automorphism of F[x] such that
φ(a) = a for every a ∈ F.

8. Let F be a field and b ≠ 0 an element of F. Define the mapping
φ: F[x] → F[x] by φ(f(x)) = f(bx) for every f(x) ∈ F[x]. Prove that φ is
an automorphism of F[x] such that φ(a) = a for every a ∈ F.

9. Let F be a field, b ≠ 0, c elements of F. Define the mapping
φ: F[x] → F[x] by φ(f(x)) = f(bx + c) for every f(x) ∈ F[x]. Prove that
φ is an automorphism of F[x] such that φ(a) = a for every a ∈ F.

10. Let φ be an automorphism of F[x], where F is a field, such that φ(a) = a
for every a ∈ F. Prove that if f(x) ∈ F[x], then deg φ(f(x)) = deg f(x).

11. Let φ be an automorphism of F[x], where F is a field, such that φ(a) = a
for every a ∈ F. Prove there exist b ≠ 0, c in F such that φ(f(x)) =
f(bx + c) for every f(x) ∈ F[x].

12. Find a nonidentity automorphism φ of Q[x] such that $\varphi^2$ is the identity
automorphism of Q[x].

13. Show that in Problem 12 you do not need the assumption φ(a) = a for
every a ∈ Q because any automorphism of Q[x] automatically satisfies
φ(a) = a for every a ∈ Q.

14. Let C be the field of complex numbers. Given an integer n > 0, exhibit
an automorphism φ of C[x] of order n.


7. FIELD OF QUOTIENTS OF AN INTEGRAL DOMAIN 


Given the integral domain Z, the ring of integers, then intimately related to Z
is the field Q of rational numbers that consists of all fractions of integers; that
is, all quotients m/n, where m, n ≠ 0 are in Z. Note that there is no unique
way of representing 1/2, say, in Q because 1/2 = 2/4 = (−7)/(−14) = ···.
In other words, we are identifying 1/2 with 2/4, (−7)/(−14), and so on. This sug-
gests that what is really going on in constructing the rationals from the inte-
gers is some equivalence relation on some set based on the integers.

The relation of Q to Z carries over to any integral domain D. Given an
integral domain D, we shall construct a field F ⊃ D whose elements will be
quotients a/b with a, b ∈ D, b ≠ 0. We go through this construction formally.

Let D be an integral domain, and let S = {(a, b) | a, b ∈ D, b ≠ 0}; S is
thus the subset of D x D—the Cartesian product of D with itself—in which 
the second component is not allowed to be 0. Think of (a, b) as a/b for a mo- 
ment; if so, when would we want to declare that (a, b) = (c, d)? Clearly, we 
would want this if a/b = c/d, which in D itself would become ad = bc. With 
this as our motivating guide we define a relation ~ on S by declaring: 


(a, b) ~ (c, d) for (a, b), (c, d) in S, if and only if ad = bc.


We first assert that this defines an equivalence relation on S. We go 
through the three requirements for an equivalence relation term by term. 


1. (a, b) ~ (a, b), for clearly ab = ba (since D is commutative). So ~ is 
reflexive. 

2. (a, b) ~ (c, d) implies that (c, d) ~ (a, b), for (a, b) ~ (c, d) means 
ad = bc; for (c, d) ~ (a, b) we need cb = da, but this is true, since 
cb = bc = ad = da. So ~ is symmetric. 

3. (a, b) ~ (c, d), (c, d) ~ (e, f) implies that ad = bc, cf = de, so adf =
bcf = bde; but d ≠ 0 and we are in an integral domain, hence af =
be follows. This says that (a, b) ~ (e, f). So the relation is transitive.

We have shown that ~ defines an equivalence relation on S. Let F be
the set of all the equivalence classes [a, b] of the elements (a, b) ∈ S. F is our
required field.

To show that F is a field, we must endow it with an addition and multi-
plication. First the addition; what should it be? Forgetting all the fancy talk
about equivalence relation and the like, we really want [a, b] to be a/b. If so,
what should [a, b] + [c, d] be other than the formally calculated

$$\frac{a}{b} + \frac{c}{d} = \frac{ad + bc}{bd}\,?$$

This motivates us to define

[a, b] + [c, d] = [ad + bc, bd]. (1)


Note that since b ≠ 0, d ≠ 0 and D is a domain, then bd ≠ 0, hence
[ad + bc, bd] is a legitimate element of F.

As usual we are plagued with having to show that the addition so de-
fined in F is well-defined. In other words, we must show that if [a, b] =
[a', b'] and [c, d] = [c', d'], then [a, b] + [c, d] = [a', b'] + [c', d']. From (1)
we must thus show that [ad + bc, bd] = [a'd' + b'c', b'd'], which is to say,
(ad + bc)b'd' = bd(a'd' + b'c'). Since [a, b] = [a', b'] and [c, d] = [c', d'],
ab' = ba' and cd' = dc'. Therefore, (ad + bc)b'd' = ab'dd' + bb'cd' =
ba'dd' + bb'dc' = (a'd' + b'c')bd, as required. Thus “+” is well-defined in F.

The class [0, b], b ≠ 0, acts as the 0 under “+,” we denote it simply as 0,
and the class [−a, b] is the negative of [a, b]. To see that this makes of F an
abelian group is easy, but laborious, for all that really needs verification is
the associative law.

Now to the multiplication in F. Again motivated by thinking of [a, b] as
a/b, we define

[a, b][c, d] = [ac, bd]. (2)


Again since b ≠ 0, d ≠ 0, we have bd ≠ 0, so the element [ac, bd] is
also a legitimate element of F.

As for the “+” we must show that the product so introduced is well-de-
fined; that is, if [a, b] = [a', b'], [c, d] = [c', d'], then

[a, b][c, d] = [ac, bd] = [a'c', b'd'] = [a', b'][c', d'].


We know that ab' = ba' and cd' = dc', so acb'd' = ab'cd' = ba'dc' = bda'c',
which is exactly what we need for [ac, bd] = [a'c', b'd']. Thus the product is
well-defined in F.

What acts as 1 in F? We claim that for any a ≠ 0, b ≠ 0 in D, [a, a] =
[b, b] (since ab = ba) and [c, d][a, a] = [ca, da] = [c, d], since (ca)d = (da)c.
So [a, a] acts as 1, and we write it simply as 1 = [a, a] (for all a ≠ 0 in D).
Given [a, b] ≠ 0, then a ≠ 0, so [b, a] is in F; hence, because [a, b][b, a] =
[ab, ba] = [ab, ab] = 1, [a, b] has an inverse in F. All that remains to show
that the nonzero elements of F form an abelian group under this product is
the associative law and commutative law. We leave these to the reader.

To clinch that F is a field, we need only now show the distributive law.
But [ad + bc, bd][e, f] = [(ad + bc)e, bdf], so

$$([a, b] + [c, d])[e, f] = [ade + bce,\ bdf],$$

while

$$\begin{aligned}
[a, b][e, f] + [c, d][e, f] &= [ae, bf] + [ce, df] = [aedf + bfce,\ bdf^2] \\
&= [(ade + bce)f,\ bdf^2] = [ade + bce,\ bdf][f, f] \\
&= [ade + bce,\ bdf],
\end{aligned}$$

which we have seen is ([a, b] + [c, d])[e, f]. The distributive law is now es-
tablished, so F is a field.

Let a ≠ 0 be a fixed element in D and consider [da, a] for any d ∈ D.
The map φ: d → [da, a] is a monomorphism of D into F. It is certainly 1-1,
for if φ(d) = [da, a] = 0, then da = 0, so d = 0, since D is an integral do-
main. Also, $\varphi(d_1 d_2) = [d_1 d_2 a, a]$ while $\varphi(d_1)\varphi(d_2) = [d_1 a, a][d_2 a, a]$ =
$[d_1 d_2 a^2, a^2] = [d_1 d_2 a, a][a, a] = [d_1 d_2 a, a] = \varphi(d_1 d_2)$. Furthermore,

$$\begin{aligned}
[d_1 a, a] + [d_2 a, a] &= [d_1 a^2 + a^2 d_2,\ a^2] \\
&= [d_1 a + d_2 a,\ a][a, a] \\
&= [(d_1 + d_2)a,\ a],
\end{aligned}$$

so $\varphi(d_1 + d_2) = [(d_1 + d_2)a, a] = [d_1 a, a] + [d_2 a, a] = \varphi(d_1) + \varphi(d_2)$. Thus
φ maps D monomorphically into F. So, D is isomorphic to a subring of F, and
we can thus consider D as “embedded” in F. We consider every element
[a, b] of F as the fraction a/b.
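The whole construction can be mirrored in a few lines of code. The sketch below is an illustration and not part of the original text: it takes D = Z as a sample integral domain, and the class name `Quot` is hypothetical. A pair is never reduced to lowest terms; equality is decided exactly by the relation ad = bc, and addition and multiplication follow formulas (1) and (2).

```python
class Quot:
    """The class [a, b] of a pair (a, b) with b != 0, here over D = Z."""

    def __init__(self, a, b):
        if b == 0:
            raise ZeroDivisionError("second component must be nonzero")
        self.a, self.b = a, b

    def __eq__(self, other):
        # (a, b) ~ (c, d) if and only if ad = bc
        return self.a * other.b == self.b * other.a

    def __add__(self, other):
        # formula (1): [a, b] + [c, d] = [ad + bc, bd]
        return Quot(self.a * other.b + self.b * other.a, self.b * other.b)

    def __mul__(self, other):
        # formula (2): [a, b][c, d] = [ac, bd]
        return Quot(self.a * other.a, self.b * other.b)

    def inverse(self):
        # [a, b] != 0 means a != 0, and then [b, a] is its inverse
        return Quot(self.b, self.a)

    def __repr__(self):
        return f"[{self.a}, {self.b}]"


def embed(d, a=7):
    """The monomorphism d -> [da, a] for a fixed a != 0 (a = 7 is arbitrary)."""
    return Quot(d * a, a)


print(Quot(1, 2) == Quot(-7, -14))          # the same class
print(Quot(1, 2) + Quot(1, 3))              # a representative of 5/6
print(embed(2) * embed(3) == embed(6))      # the embedding respects products
```

Different representatives of the same classes give equal sums and products, which is the well-definedness verified in the text.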


Theorem 4.7.1. Let D be an integral domain. Then there exists a field
F ⊃ D which consists of all fractions a/b, as defined above, of elements in D.


The field F is called the field of quotients of D. When D = Z, then F is
isomorphic to the field Q of rational numbers. Also, if D is the domain of
even integers, then F is also the entire field Q.

What we did above in constructing the field of quotients of D was a 
long, formal, wordy, and probably dull way of doing something that is by its
nature something very simple. We really are doing nothing more than form-
ing all formal fractions a/b, a, b ≠ 0 in D, where we add and multiply frac-
tions as usual. However, it is sometimes necessary to see something done to
its last detail, painful though it may be. Most of us had never seen a really 
formal and precise construction of the rationals from the integers. Now 
that we have constructed F from D in this formal manner, forget the de- 
tails and think of F as the set of all fractions of elements of D. 


PROBLEMS 


1. Prove the associative law of addition in F. 

2. Prove the commutative law of addition in F. 

3. Prove that the product in F is commutative and associative. 

4. If K is any field that contains D, show that K ⊃ F. (So F is the smallest
field that contains D.)


5 


FIELDS 


The notion of a ring was unfamiliar territory for most of us; that of a field 
touches more closely on our experience. While the only ring, other than a 
field, that we might have seen in our early training was the ring of integers, 
we had a bit more experience working with rational numbers, real numbers, 
and, some of us, complex numbers, in solving linear and quadratic equations. 
The ability to divide by nonzero elements gave us a bit of leeway, which we 
might not have had with the integers, in solving a variety of problems. 

So at first glance, when we start working with fields we feel that we are 
on home ground. As we penetrate deeper into the subject, we start running 
across new ideas and new areas of results. Once again we’ll be in unfamiliar 
territory, but hopefully, after some exposure to the topic, the notions will be- 
come natural to us. 

Fields play an important role in geometry, in the theory of equations, 
and in certain very important parts of number theory. We shall touch upon 
each of these aspects as we progress. Unfortunately, because of the technical 
machinery we would need to develop, we do not go into Galois theory, a very 
beautiful part of the subject. We hope that many of the readers will make con- 
tact with Galois theory, and beyond, in their subsequent mathematical training. 


1. EXAMPLES OF FIELDS 
Let’s recall that a field F is a commutative ring with unit element 1 such that for
every nonzero a ∈ F there is an element $a^{-1}$ ∈ F such that $aa^{-1} = 1$. In other
words, fields are “something like” the rationals Q. But are they really? The in-
tegers mod p, $Z_p$, where p is a prime, form a field; in $Z_p$ we have the relation

$$p1 = \underbrace{1 + 1 + \cdots + 1}_{p\ \text{times}} = 0.$$

Nothing akin to this happens in Q. There are even sharper differences
among fields—how polynomials factor over them, special properties of which 
we'll see in some examples, and so on. 
We begin with some familiar examples. 


Examples 


1. Q, the field of rational numbers. 

2. R, the field of real numbers.

3. C, the field of complex numbers.

4. Let F = {a + bi | a, b ∈ Q} ⊂ C. To see that F is a field is relatively easy.
We only verify that if a + bi ≠ 0 is in F, then $(a + bi)^{-1}$ is also in F. But
what is $(a + bi)^{-1}$? It is merely

$$\frac{a}{a^2 + b^2} - \frac{bi}{a^2 + b^2} \quad \text{(verify!)},$$

and since $a^2 + b^2 \neq 0$ and is rational, therefore $a/(a^2 + b^2)$ and also
$b/(a^2 + b^2)$ are rational, hence $(a + bi)^{-1}$ is indeed in F.

5. Let F = {a + b√2 | a, b ∈ Q} ⊂ R. Again the verification that F is a field
is not too hard. Here, too, we only show the existence of inverses in F for
the nonzero elements of F. Suppose that a + b√2 ≠ 0 is in F; then, since
√2 is irrational, $a^2 - 2b^2 \neq 0$. Because

$$(a + b\sqrt{2})(a - b\sqrt{2}) = a^2 - 2b^2,$$

we see that (a + b√2)(a/c − √2 b/c) = 1, where $c = a^2 - 2b^2$. The re-
quired inverse for a + b√2 is a/c − √2 b/c, which is certainly an ele-
ment of F, since a/c and b/c are rational.

6. Let F be any field and let F[x] be the ring of polynomials in x over F.
Since F[x] is an integral domain, it has a field of quotients according to
Theorem 4.7.1, which consists of all quotients f(x)/g(x), where f(x) and
g(x) are in F[x] and g(x) ≠ 0. This field of quotients of F[x] is denoted by
F(x) and is called the field of rational functions in x over F.

7. Z,, the integers modulo the prime p, is a (finite) field. 

8. In Example 2 in Section 4 of Chapter 4 we saw how to construct a field 
having nine elements. 


These eight examples are specific ones. Using the theorems we
have proved earlier, we have some general constructions of fields.

9. If D is any integral domain, then it has a field of quotients, by Theorem
4.7.1, which consists of all the fractions a/b, where a and b are in D and
b ≠ 0.

10. If R is a commutative ring with unit element 1 and M is a maximal ideal 
of R, then Theorem 4.4.2 tells us that R/M is a field. 


This last example, for particular R’s, will play an important role in what 
is to follow in this chapter. 

We could go on, especially with special cases of Examples 9 and 10, to 
see more examples. But the 10 that we did see above show us a certain vari- 
ety of fields, and we see that it is not too hard to run across fields. 
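The inverse formula of Example 5 can be checked with exact rational arithmetic. This is a sketch only and not part of the original text: an element a + b√2 is stored as the pair (a, b) of rationals, and the function names `mul` and `inverse` are hypothetical.

```python
from fractions import Fraction


def mul(x, y):
    """(a + b*sqrt2)(c + d*sqrt2) = (ac + 2bd) + (ad + bc)*sqrt2."""
    a, b = x
    c, d = y
    return (a * c + 2 * b * d, a * d + b * c)


def inverse(x):
    """Inverse a/c - (b/c)*sqrt2 with c = a^2 - 2b^2, as in Example 5."""
    a, b = Fraction(x[0]), Fraction(x[1])
    c = a * a - 2 * b * b
    if c == 0:
        # Since sqrt2 is irrational, this only happens for a = b = 0.
        raise ZeroDivisionError("0 has no inverse")
    return (a / c, -b / c)


x = (Fraction(3), Fraction(1))      # 3 + sqrt2
x_inv = inverse(x)                  # 3/7 - (1/7) sqrt2
print(x_inv, mul(x, x_inv))
```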

In Examples 7 and 8 the fields are finite. If F is a finite field having q el- 
ements, viewing F merely as an abelian group under its addition, “+,” we 
have, by Theorem 2.4.5, that qx = 0 for every x ∈ F. This is a behavior quite
distinct from that which happens in the fields that we are used to, such as the 
rationals and reals. 

We single out this kind of behavior in the 


Definition. A field F is said to have (or, to be of) characteristic p ≠ 0
if for some positive integer p, px = 0 for all x ∈ F, and no positive integer
smaller than p enjoys this property.


If a field F is not of characteristic p ≠ 0 for any positive integer p, we
call it a field of characteristic 0. So Q, R, C are fields of characteristic 0, while
$Z_3$ is of characteristic 3.

In the definition given above the use of the letter p for the characteris- 
tic is highly suggestive, for we have always used p to denote a prime. In fact, 
as we see in the next theorem, this usage of p is consistent.


Theorem 5.1.1. The characteristic of a field is either 0 or a prime 
number. 


Proof. If the field F has characteristic 0, there is nothing more to say.
Suppose then that mx = 0 for all x ∈ F, where m is a positive integer. Let p
be the smallest positive integer such that px = 0 for all x ∈ F. We claim that
p is a prime. For if p = uv, where u > 1 and v > 1 are integers, then in F,
(u1)(v1) = (uv)1 = 0, where 1 is the unit element of F. But F, being a field, is
an integral domain (Problem 1); therefore, u1 = 0 or v1 = 0. In either case
we get that 0 = (u1)(x) = ux [or, similarly, 0 = (v1)x = vx] for any x in F.
But this contradicts our choice of p as the smallest integer with this property,
since u < p and v < p. Hence p is a prime. □

Note that we did not use the full force of the assumption that F was a
field. We only needed that F was an integral domain (with 1). So if we define
the characteristic of an integral domain to be 0 or the smallest positive inte-
ger p such that px = 0 for all x ∈ F, we obtain the same result. Thus the


Corollary. If D is an integral domain, then its characteristic is either 0 
or a prime number. 
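The role of the integral-domain hypothesis can be seen by experimenting with the rings $Z_n$ of integers mod n. The sketch below is an illustration and not part of the original text, and its function names are hypothetical. For n = 7 the smallest m with m1 = 0 is the prime 7 and there are no zero divisors; for n = 6 that smallest m is the composite 6 = 2·3, and correspondingly (2·1)(3·1) = 0 exhibits zero divisors, so $Z_6$ is not an integral domain.

```python
def additive_order_of_one(n):
    """Smallest positive m with m*1 = 0 in Z_n, found by repeated addition."""
    total, m = 1 % n, 1
    while total != 0:
        total = (total + 1) % n
        m += 1
    return m


def zero_divisor_pairs(n):
    """Pairs (u, v) with 1 <= u <= v < n and uv = 0 in Z_n."""
    return [(u, v) for u in range(1, n) for v in range(u, n) if (u * v) % n == 0]


print(additive_order_of_one(7), zero_divisor_pairs(7))
print(additive_order_of_one(6), zero_divisor_pairs(6))
```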


PROBLEMS 


1. Show that a field is an integral domain. 

2. Prove the Corollary even if D does not have a unit element. 

3. Given a ring R, let S = R[x] be the ring of polynomials in x over R, and 
let T = S[y] be the ring of polynomials in y over S. Show that: 

(a) Any element f(x, y) in T has the form $\sum\sum a_{ij} x^i y^j$, where the $a_{ij}$ are
in R.

(b) In terms of the form of f(x, y) in T given in Part (a), give the condi- 
tion for the equality of two elements f(x, y) and g(x, y) in T. 

(c) In terms of the form for f(x, y) in Part (a), give the formula for 
f(x, y) + g(x, y), for f(x, y), g(x, y) in T. 

(d) Give the form for the product of f(x, y) and g(x, y) if f(x, y) and 
g(x, y) are in T. (T is called the ring of polynomials in two variables 
over R, and is denoted by R[x, y].) 

4. If D is an integral domain, show that D[x, y] is an integral domain. 

5. If F is a field and D = F[x, y], the field of quotients of D is called the
field of rational functions in two variables over F, and is usually denoted
by F(x, y). Give the form of the typical element of F(x, y).

6. Prove that F(x, y) is isomorphic to F(y, x).

7. If F is a field of characteristic p ≠ 0, show that $(a + b)^p = a^p + b^p$ for all
a, b ∈ F. (Hint: Use the binomial theorem and the fact that p is a prime.)

8. If F is a field of characteristic p ≠ 0, show that $(a + b)^m = a^m + b^m$,
where $m = p^n$, for all a, b in F and any positive integer n.

9. Let F be a field of characteristic p ≠ 0 and let φ: F → F be defined by
$\varphi(a) = a^p$ for all a ∈ F.

(a) Show that φ defines a monomorphism of F into itself.
(b) Give an example of a field F where φ is not onto. (Very hard.)

10. If F is a finite field of characteristic p, show that the mapping φ defined
above is onto, hence is an automorphism of F.


2. A BRIEF EXCURSION INTO VECTOR SPACES


To get into the things we should like to do in field theory, we need some 
technical equipment that we do not have as yet. This concerns the relation of 
two fields K ⊃ F and what we would like to consider as some measure of the
size of K compared to that of F. This size is what we shall call the dimension
or degree of K over F.

However, in these considerations, much less is needed of K than that it
be a field. We would be remiss if we proved these results only for the special
context of two fields K ⊃ F because the same ideas, proofs, and spirit hold in
a far wider situation. We need the notion of a vector space over a field F. 
Aside from the fact that what we do in vector spaces will be important in our 
situation of fields, the ideas developed appear in all parts of mathematics. 
Students of algebra must see these things at some stage of their training. An 
appropriate place is right here. 


Definition. A vector space V over a field F is an abelian group under
“+” such that for every α ∈ F and every v ∈ V there is an element αv ∈ V,
and such that:

1. $\alpha(v_1 + v_2) = \alpha v_1 + \alpha v_2$, for α ∈ F, $v_1, v_2$ ∈ V.

2. (α + β)v = αv + βv, for α, β ∈ F, v ∈ V.

3. α(βv) = (αβ)v, for α, β ∈ F, v ∈ V.

4. 1v = v for all v ∈ V, where 1 is the unit element of F.


In discussing vector spaces—which we will do very briefly—we shall 
use lowercase Latin letters for elements of V and lowercase Greek letters for 
elements of F. 

Our basic concern here will be with only one aspect of the theory of vec- 
tor spaces: the notion of the dimension of V over F. We shall develop this no- 
tion as expeditiously as possible, not necessarily in the best or most elegant 
way. We would strongly advise the readers to see the other sides of what is 
done in vector spaces in other books on algebra or linear algebra (for instance, 
our books A Primer on Linear Algebra and Matrix Theory and Linear Algebra). 

Before getting down to some results, we look at some examples. We 
leave to the reader the details of verifying, in each case, that the example 
really is an example of a vector space. 


Examples 


1. Let F be any field and let V = {$(\alpha_1, \alpha_2, \ldots, \alpha_n)$ | $\alpha_i$ ∈ F for all i} be the set
of n-tuples over F, with equality and addition defined component-wise. For
$v = (\alpha_1, \alpha_2, \ldots, \alpha_n)$ and β ∈ F, define βv by $\beta v = (\beta\alpha_1, \beta\alpha_2, \ldots, \beta\alpha_n)$.
V is a vector space over F.


2. Let F be any field and let V = F[x] be the ring of polynomials in x over F.
Forgetting the product of any arbitrary elements of F[x] but using only that
of a polynomial by a constant, for example, we find that

$$\beta(\alpha_0 + \alpha_1 x + \cdots + \alpha_n x^n) = \beta\alpha_0 + \beta\alpha_1 x + \cdots + \beta\alpha_n x^n.$$

In this way V becomes a vector space over F.


3. Let V be as in Example 2 and let W = {f(x) ∈ V | deg(f(x)) ≤ n}. Then W
is a vector space over F, and W ⊂ V is a subspace of V in the following sense.


Definition. A subspace of a vector space V is a nonempty subset W of
V such that αw ∈ W and $w_1 + w_2$ ∈ W for all α in F and $w, w_1, w_2$ ∈ W.


Note that the definition of subspace W of V implies that W is a vector 
space whose operations are just those of V restricted to the elements of W. 


4. Let V be the set of all real-valued differentiable functions on [0, 1], the 
closed unit interval, with the usual addition and multiplication of a function 
by a real number. Then V is a vector space over R. 


5. Let W be all the real-valued continuous functions on [0, 1], again with the 
usual addition and multiplication of a function by a real number. W, too, is a 
vector space over R, and the V in Example 4 is a subspace of W. 


6. Let F be any field, F[x] the ring of polynomials in x over F. Let f(x)
be in F[x] and J = (f(x)) the ideal of F[x] generated by f(x). Let V =
F[x]/J, where we define α(g(x) + J) = αg(x) + J. V is then a vector space
over F.


7. Let R be the real field and let V be the set of all solutions to the differen-
tial equation $d^2y/dx^2 + y = 0$. V is a vector space over R.


8. Let V be any vector space over a field F, $v_1, v_2, \ldots, v_n$ elements of V. Let
$\langle v_1, v_2, \ldots, v_n\rangle = \{\alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n \mid \alpha_1, \alpha_2, \ldots, \alpha_n \in F\}$. Then
$\langle v_1, v_2, \ldots, v_n\rangle$ is a vector space over F and is a subspace of V. This subspace
$\langle v_1, v_2, \ldots, v_n\rangle$ is called the subspace of V generated or spanned by $v_1, \ldots, v_n$
over F; its elements are called linear combinations of $v_1, \ldots, v_n$. We shall
soon have a great deal to say about $\langle v_1, v_2, \ldots, v_n\rangle$.


9. Let V and W be vector spaces over the field F and let V ⊕ W =
{(v, w) | v ∈ V, w ∈ W}, with equality and addition defined componentwise,
and where α(v, w) = (αv, αw). Then V ⊕ W is easily seen to be a vector
space over F; it is called the direct sum of V and W.


10. Let K ⊃ F be two fields, where the addition “+” is that of K and where
αv, for α ∈ F and v ∈ K is the product, as elements of the field K. Then Con-
ditions 1 and 2 defining a vector space are merely special cases of the distrib-
utive laws that hold in K, and Condition 3 is merely a consequence of the
associativity of the product in K. Finally, Condition 4 is just the restate-
ment of the fact that 1 is the unit element of K. So K is a vector space
over F.


In one respect there is a sharp difference among these examples. We 
specify this difference by examining some of these examples in turn. 


1. In Example 1, if

$$v_1 = (1, 0, \ldots, 0),\quad v_2 = (0, 1, 0, \ldots, 0),\quad \ldots,\quad v_n = (0, 0, \ldots, 1),$$

then every element v in V has a unique representation in the form v =
$\alpha_1 v_1 + \cdots + \alpha_n v_n$, where $\alpha_1, \ldots, \alpha_n$ are in F.

2. In Example 3, if $v_1 = 1$, $v_2 = x$, ..., $v_i = x^{i-1}$, ..., $v_{n+1} = x^n$, then every
v ∈ W has a unique representation as $v = \alpha_1 v_1 + \cdots + \alpha_{n+1} v_{n+1}$, with the $\alpha_i$
in F.

3. In Example 7, every solution of $d^2y/dx^2 + y = 0$ is of the unique form
y = α cos x + β sin x, with α and β real.

4. In Example 8, every v ∈ $\langle v_1, \ldots, v_n\rangle$ has a representation, albeit not nec-
essarily unique, as $v = \alpha_1 v_1 + \cdots + \alpha_n v_n$ from the very definition of
$\langle v_1, \ldots, v_n\rangle$. Uniqueness of this representation depends heavily on the ele-
ments $v_1, \ldots, v_n$.

5. In the special case of Example 10, where K = C, the field of complex
numbers, and F = R that of the real numbers, then every v ∈ C is of the
unique form v = α + βi, α, β ∈ R.

6. Consider K = F(x) ⊃ F, the field of rational functions in x over F. We
claim, and leave to the reader, that we cannot find any finite set of ele-
ments in K which spans K over F. This phenomenon was also true in some
of the other examples we gave of vector spaces.


The whole focus of our attention here will be on this notion of a vector 
space having some finite subset that spans it over the base field. 

Before starting this discussion, we must first dispose of a list of formal 
properties that hold in a vector space. You, dear reader, are by now so 
sophisticated in dealing with these formal, abstract things that we leave the
proof of the next lemma to you. 


Lemma 5.2.1. If $V$ is a vector space over the field $F$, then, for every
$\alpha \in F$ and every $v \in V$:
(a) $\alpha 0 = 0$, where 0 is the zero-element of $V$.
(b) $0v = 0$, where the first 0 is the zero in $F$.
(c) $\alpha v = 0$ implies that $\alpha = 0$ or $v = 0$.
(d) $(-\alpha)v = -(\alpha v)$.


In view of this lemma we shall not run into any confusion if we use the 
symbol 0 both for the zero of F and that of V. 

We forget vector spaces for a moment and look at solutions of certain
systems of linear equations in fields. Take, for example, the two linear
homogeneous equations with real coefficients, $x_1 + x_2 + x_3 = 0$ and
$3x_1 - x_2 + x_3 = 0$. We easily see that for any $x_1$, $x_3$ such that
$4x_1 + 2x_3 = 0$ and $x_2 = -(x_1 + x_3)$, we get a solution to this system. In fact,
there exists an infinity of solutions to this system other than the trivial one
$x_1 = 0$, $x_2 = 0$, $x_3 = 0$. If we look at this example and ask ourselves: "Why is
there an infinity of solutions to this system of linear equations?", we quickly
come to the conclusion that, because there are more variables than equations,
we have room to maneuver to produce solutions. This is exactly the situation
that holds more generally, as we see below.
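As a quick check of this example, here is a minimal Python sketch using exact rational arithmetic. The names `solution` and `satisfies_system` and the parameter `t` are illustrative choices of ours, not notation from the text. Taking $x_1 = t$ forces $x_3 = -2t$ and $x_2 = t$, so each value of $t$ gives a solution.

```python
from fractions import Fraction

def solution(t):
    """Return (x1, x2, x3) with x1 = t, x3 = -2t (so 4*x1 + 2*x3 = 0)
    and x2 = -(x1 + x3)."""
    x1 = Fraction(t)
    x3 = -2 * x1
    x2 = -(x1 + x3)
    return (x1, x2, x3)

def satisfies_system(x1, x2, x3):
    """Check x1 + x2 + x3 = 0 and 3*x1 - x2 + x3 = 0."""
    return x1 + x2 + x3 == 0 and 3 * x1 - x2 + x3 == 0

# Every value of t gives a solution; any t != 0 gives a nontrivial one.
for t in (1, -3, Fraction(5, 7)):
    assert satisfies_system(*solution(t))
```

Since $t$ ranges over all real numbers, this one-parameter family already accounts for the infinity of solutions mentioned above.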


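Definition. Let $F$ be a field; then the $n$-tuple $(\beta_1, \ldots, \beta_n)$, where the
$\beta_i$ are in $F$, and not all of them are 0, is said to be a nontrivial solution in $F$ to
the system of homogeneous linear equations

\[
\begin{aligned}
\alpha_{11}x_1 + \alpha_{12}x_2 + \cdots + \alpha_{1n}x_n &= 0\\
\alpha_{21}x_1 + \alpha_{22}x_2 + \cdots + \alpha_{2n}x_n &= 0\\
&\ \ \vdots\\
\alpha_{i1}x_1 + \alpha_{i2}x_2 + \cdots + \alpha_{in}x_n &= 0\\
&\ \ \vdots\\
\alpha_{r1}x_1 + \alpha_{r2}x_2 + \cdots + \alpha_{rn}x_n &= 0
\end{aligned}
\tag{*}
\]

where the $\alpha_{ij}$ are all in $F$, if substituting $x_1 = \beta_1, \ldots, x_n = \beta_n$ satisfies all
the equations of (*).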


For such a system (*) we have the following

Theorem 5.2.2. If $n > r$, that is, if the number of variables (unknowns)
exceeds the number of equations in (*), then (*) has a nontrivial
solution in $F$.


Proof. The method is that, which some of us learned in high school, of 
solving simultaneous equations by eliminating one of the unknowns and at 
the same time cutting the number of equations down by one. 

We proceed by induction on $r$, the number of equations. If $r = 1$, the
system (*) reduces to $\alpha_{11}x_1 + \cdots + \alpha_{1n}x_n = 0$, and $n > 1$. If all the $\alpha_{1i} = 0$,
then $x_1 = x_2 = \cdots = x_n = 1$ is a nontrivial solution to (*). So, on renumbering,
we may assume that $\alpha_{11} \neq 0$; we then have the solution to (*), which is
nontrivial: $x_2 = \cdots = x_n = 1$ and $x_1 = -(1/\alpha_{11})(\alpha_{12} + \cdots + \alpha_{1n})$.


Suppose that the result is correct for $r = k$ for some $k$ and suppose that
(*) is a system of $k + 1$ linear homogeneous equations in $n > k + 1$ variables.
As above, we may assume that some $\alpha_{ij} \neq 0$, and, without loss of
generality, that $\alpha_{11} \neq 0$.

We construct a related system, (**), of k linear homogeneous equations 
in n — 1 variables; since n > k + 1, we have that n — 1 > k, so we can apply 
induction to this new system (**). How do we get this new system? We want 
to eliminate $x_1$ among the equations. We do so by subtracting $\alpha_{i1}/\alpha_{11}$ times
the first equation from the $i$th one for each of $i = 2, 3, \ldots, k + 1$. In doing
so, we end up with the new system of $k$ linear homogeneous equations in
$n - 1$ variables:

\[
\begin{aligned}
\beta_{22}x_2 + \cdots + \beta_{2n}x_n &= 0\\
\beta_{32}x_2 + \cdots + \beta_{3n}x_n &= 0\\
&\ \ \vdots\\
\beta_{k+1,2}x_2 + \cdots + \beta_{k+1,n}x_n &= 0,
\end{aligned}
\tag{**}
\]

where $\beta_{ij} = \alpha_{ij} - \alpha_{i1}\alpha_{1j}/\alpha_{11}$ for $i = 2, 3, \ldots, k + 1$ and $j = 2, 3, \ldots, n$.

Since (**) is a system of $k$ linear homogeneous equations in $n - 1$ variables
and $n - 1 > k$, by our induction (**) has a nontrivial solution $(\gamma_2, \ldots, \gamma_n)$
in $F$. Let $\gamma_1 = -(\alpha_{12}\gamma_2 + \cdots + \alpha_{1n}\gamma_n)/\alpha_{11}$; we leave it to the reader to verify
that the $(\gamma_1, \gamma_2, \ldots, \gamma_n)$ so obtained is a required nontrivial solution to (*).


This completes the induction and so proves the theorem. □
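The induction in this proof is effectively a recursive procedure, and it can be sketched in Python over the rationals. This is only an illustration under our own conventions: the function name `nontrivial_solution`, the row-list input format, and the choice of `Fraction` as the field are assumptions, not part of the text. Equations with all coefficients zero are dropped, the variables are renumbered so that the leading coefficient is nonzero, and the first variable is eliminated exactly as in the passage from (*) to (**).

```python
from fractions import Fraction

def nontrivial_solution(rows, n):
    """Return a nontrivial solution of the homogeneous system whose
    coefficient rows are `rows` (each of length n), assuming n > len(rows).
    Mirrors the induction on the number of equations in the proof."""
    rows = [[Fraction(c) for c in row] for row in rows]
    # Equations with all coefficients 0 impose no condition; drop them.
    rows = [row for row in rows if any(c != 0 for c in row)]
    if not rows:
        return [Fraction(1)] * n          # x_1 = ... = x_n = 1 works
    # Renumber the variables so that the first equation has a nonzero
    # coefficient on the first variable.
    j = next(idx for idx, c in enumerate(rows[0]) if c != 0)
    for row in rows:
        row[0], row[j] = row[j], row[0]
    first, rest = rows[0], rows[1:]
    # Eliminate the first variable: subtract (a_i1 / a_11) * (first equation).
    reduced = [[row[c] - row[0] / first[0] * first[c] for c in range(1, n)]
               for row in rest]
    tail = nontrivial_solution(reduced, n - 1)   # induction hypothesis
    x_first = -sum(first[c] * tail[c - 1] for c in range(1, n)) / first[0]
    sol = [x_first] + tail
    sol[0], sol[j] = sol[j], sol[0]       # undo the renumbering
    return sol

# The two real equations discussed before the theorem:
# x1 + x2 + x3 = 0 and 3*x1 - x2 + x3 = 0.
example = nontrivial_solution([[1, 1, 1], [3, -1, 1]], 3)
```

The recursion bottoms out when no nonzero equations remain, which corresponds to the all-ones solution used in the case $r = 1$ of the proof.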


With this result established, we are free to use it in our study of vector 
spaces. We now return to these spaces. We repeat, for emphasis, something 
we defined earlier in Example 8. 
Definition. Let $V$ be a vector space over $F$ and let $v_1, v_2, \ldots, v_n$ be in
$V$. The element $v \in V$ is said to be a linear combination of $v_1, v_2, \ldots, v_n$ if
$v = \alpha_1 v_1 + \cdots + \alpha_n v_n$ for some $\alpha_1, \ldots, \alpha_n$ in $F$.

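As we indicated in Example 8, the set $\langle v_1, v_2, \ldots, v_n \rangle$ of all linear
combinations of $v_1, v_2, \ldots, v_n$ is a vector space over $F$, and being contained in $V$,
is a subspace of $V$. Why is it a vector space? If $\alpha_1 v_1 + \cdots + \alpha_n v_n$ and
$\beta_1 v_1 + \cdots + \beta_n v_n$ are two linear combinations of $v_1, \ldots, v_n$, then

\[
(\alpha_1 v_1 + \cdots + \alpha_n v_n) + (\beta_1 v_1 + \cdots + \beta_n v_n)
= (\alpha_1 + \beta_1)v_1 + \cdots + (\alpha_n + \beta_n)v_n
\]

by the axioms defining a vector space, and so is in $\langle v_1, \ldots, v_n \rangle$. If $\gamma \in F$ and
$\alpha_1 v_1 + \cdots + \alpha_n v_n \in \langle v_1, \ldots, v_n \rangle$, then

\[
\gamma(\alpha_1 v_1 + \cdots + \alpha_n v_n) = \gamma\alpha_1 v_1 + \cdots + \gamma\alpha_n v_n,
\]

and is also in $\langle v_1, \ldots, v_n \rangle$. Thus $\langle v_1, \ldots, v_n \rangle$ is a vector space. As we called it
earlier, it is the subspace of $V$ spanned over $F$ by $v_1, \ldots, v_n$.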
This leads us to the ultra-important definition. 


Definition. The vector space $V$ over $F$ is finite dimensional over $F$ if
$V = \langle v_1, \ldots, v_n \rangle$ for some $v_1, \ldots, v_n$ in $V$, that is, if $V$ is spanned over $F$ by a
finite set of elements.


Otherwise, we say that $V$ is infinite dimensional over $F$. Note that although
we have defined what is meant by a finite-dimensional vector space, we still
have not defined what is meant by its dimension. That will come in due course.


Suppose that $V$ is a vector space over $F$ and $v_1, \ldots, v_n$ in $V$ are such
that every element $v$ in $\langle v_1, \ldots, v_n \rangle$ has a unique representation in the form
$v = \alpha_1 v_1 + \cdots + \alpha_n v_n$, where $\alpha_1, \ldots, \alpha_n \in F$. Since

\[
0 \in \langle v_1, \ldots, v_n \rangle \quad \text{and} \quad 0 = 0v_1 + \cdots + 0v_n,
\]

by the uniqueness we have assumed we obtain that if $\alpha_1 v_1 + \cdots + \alpha_n v_n = 0$,
then $\alpha_1 = \alpha_2 = \cdots = \alpha_n = 0$. This prompts a second ultra-important
definition.

Definition. Let $V$ be a vector space over $F$; then the elements $v_1, \ldots, v_n$
in $V$ are said to be linearly independent over $F$ if $\alpha_1 v_1 + \cdots + \alpha_n v_n = 0$,
where $\alpha_1, \ldots, \alpha_n$ are in $F$, implies that $\alpha_1 = \alpha_2 = \cdots = \alpha_n = 0$.
If the elements $v_1, \ldots, v_n$ in $V$ are not linearly independent over $F$,
then we say that they are linearly dependent over $F$. For example, if $\mathbb{R}$ is the
field of real numbers and $V$ is the set of 3-tuples over $\mathbb{R}$ as defined in
Example 1, then $(0, 0, 1)$, $(0, 1, 0)$, and $(1, 0, 0)$ are linearly independent over $\mathbb{R}$
(Prove!) while $(1, -2, 7)$, $(0, 1, 0)$, and $(1, -3, 7)$ are linearly dependent over
$\mathbb{R}$, since $1(1, -2, 7) + (-1)(0, 1, 0) + (-1)(1, -3, 7) = (0, 0, 0)$ is a nontrivial
linear combination of these elements over $\mathbb{R}$, which is the 0-vector.
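The dependence relation just quoted can be checked mechanically. The following Python sketch does so with exact arithmetic; the helper names `combine` and `det3` are ours. As an additional check that the text does not use, it applies the standard determinant criterion: three 3-tuples over a field are linearly independent exactly when the $3 \times 3$ determinant they form is nonzero.

```python
from fractions import Fraction

def combine(coeffs, vectors):
    """Return the linear combination sum(c * v), computed componentwise."""
    length = len(vectors[0])
    return tuple(sum(Fraction(c) * v[k] for c, v in zip(coeffs, vectors))
                 for k in range(length))

def det3(u, v, w):
    """Determinant of the 3x3 matrix with rows u, v, w."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

dependent = [(1, -2, 7), (0, 1, 0), (1, -3, 7)]
standard = [(0, 0, 1), (0, 1, 0), (1, 0, 0)]

# The relation from the text: 1*v1 + (-1)*v2 + (-1)*v3 = 0.
relation = combine([1, -1, -1], dependent)
```

Here `relation` is the zero 3-tuple, `det3(*dependent)` vanishes, and `det3(*standard)` does not, in agreement with the claims above.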

Note that linear independence depends on the field $F$. If $\mathbb{C} \supset \mathbb{R}$ are the
complex and real fields, respectively, then $\mathbb{C}$ is a vector space over $\mathbb{R}$, but it is
also a vector space over $\mathbb{C}$ itself. The elements $1, i$ in $\mathbb{C}$ are linearly
independent over $\mathbb{R}$ but are not so over $\mathbb{C}$, since $i \cdot 1 + (-1)i = 0$ is a nontrivial
linear combination of $1, i$ over $\mathbb{C}$.


We prove 
Lemma 5.2.3. If $V$ is a vector space over $F$ and $v_1, \ldots, v_n$ in $V$ are
linearly independent over $F$, then every element $v \in \langle v_1, \ldots, v_n \rangle$ has a unique
representation as

\[
v = \alpha_1 v_1 + \cdots + \alpha_n v_n
\]

with $\alpha_1, \ldots, \alpha_n$ in $F$.


Proof. Suppose that $v \in \langle v_1, \ldots, v_n \rangle$ has the two representations
$v = \alpha_1 v_1 + \cdots + \alpha_n v_n = \beta_1 v_1 + \cdots + \beta_n v_n$ with the $\alpha$'s and $\beta$'s in $F$. This
gives us that $(\alpha_1 - \beta_1)v_1 + \cdots + (\alpha_n - \beta_n)v_n = 0$; since $v_1, \ldots, v_n$ are
linearly independent over $F$, we conclude that $\alpha_1 - \beta_1 = 0, \ldots, \alpha_n - \beta_n = 0$,
yielding for us the uniqueness of the representation. □


How finite is a finite-dimensional vector space? To measure this, call a
subset $v_1, \ldots, v_n$ of $V$ a minimal generating set for $V$ over $F$ if $V = \langle v_1, \ldots, v_n \rangle$
and no set of fewer than $n$ elements spans $V$ over $F$.

We now come to the third vitally important definition. 


Definition. If $V$ is a finite-dimensional vector space over $F$, then the
dimension of $V$ over $F$, written $\dim_F(V)$, is $n$, the number of elements in a
minimal generating set for $V$ over $F$.


In the examples given, $\dim_{\mathbb{R}}(\mathbb{C}) = 2$, since $1, i$ is a minimal generating
set for $\mathbb{C}$ over $\mathbb{R}$. However, $\dim_{\mathbb{C}}(\mathbb{C}) = 1$. In Example 1, $\dim_F(V) = n$ and in
Example 3, $\dim_F(V) = n + 1$. In Example 7 the dimension of $V$ over $F$ is 2.
Finally, if $\langle v_1, \ldots, v_n \rangle \subset V$, then $\dim_F \langle v_1, \ldots, v_n \rangle$ is at most $n$.

We now prove 
Lemma 5.2.4. If $V$ is finite dimensional over $F$ of dimension $n$ and if
the elements $v_1, \ldots, v_n$ of $V$ generate $V$ over $F$, then $v_1, \ldots, v_n$ are linearly
independent over $F$.


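Proof. Suppose that $v_1, \ldots, v_n$ are linearly dependent over $F$; thus there
is a linear combination $\alpha_1 v_1 + \cdots + \alpha_n v_n = 0$, where not all the $\alpha_i$ are 0. We
may suppose, without loss of generality, that $\alpha_1 \neq 0$; then
$v_1 = (-1/\alpha_1)(\alpha_2 v_2 + \cdots + \alpha_n v_n)$. Given $v \in V$, because $v_1, \ldots, v_n$ is a generating
set for $V$ over $F$,

\[
\begin{aligned}
v &= \beta_1 v_1 + \cdots + \beta_n v_n\\
  &= \left(-\frac{\beta_1}{\alpha_1}\right)(\alpha_2 v_2 + \cdots + \alpha_n v_n) + \beta_2 v_2 + \cdots + \beta_n v_n;
\end{aligned}
\]

thus $v_2, \ldots, v_n$ span $V$ over $F$, contradicting that the subset $v_1, v_2, \ldots, v_n$ is a
minimal generating set of $V$ over $F$. □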


We now come to yet another important definition. 


Definition. Let $V$ be a finite-dimensional vector space over $F$; then
$v_1, \ldots, v_n$ is a basis of $V$ over $F$ if the elements $v_1, \ldots, v_n$ span $V$ over $F$ and
are linearly independent over $F$.


By Lemma 5.2.4 any minimal generating set of $V$ over $F$ is a basis of $V$
over $F$. Thus, finite-dimensional vector spaces have bases.


Theorem 5.2.5. Suppose that $V$ is finite dimensional over $F$. Then any two
bases of $V$ over $F$ have the same number of elements, and this number is
exactly $\dim_F(V)$.


Proof. Let $v_1, \ldots, v_n$ and $w_1, \ldots, w_m$ be two bases of $V$ over $F$. We
want to show that $m = n$. Suppose that $m > n$. Because $v_1, \ldots, v_n$ is a basis
of $V$ over $F$, we know that every element in $V$ is a linear combination of the
$v_i$ over $F$. In particular, $w_1, \ldots, w_m$ are each a linear combination of $v_1, \ldots, v_n$
over $F$. Thus we have

\[
\begin{aligned}
w_1 &= \alpha_{11}v_1 + \alpha_{12}v_2 + \cdots + \alpha_{1n}v_n\\
w_2 &= \alpha_{21}v_1 + \alpha_{22}v_2 + \cdots + \alpha_{2n}v_n\\
&\ \ \vdots\\
w_m &= \alpha_{m1}v_1 + \alpha_{m2}v_2 + \cdots + \alpha_{mn}v_n
\end{aligned}
\]

where the $\alpha_{ij}$ are in $F$.
Consider

\[
\beta_1 w_1 + \cdots + \beta_m w_m = (\alpha_{11}\beta_1 + \alpha_{21}\beta_2 + \cdots + \alpha_{m1}\beta_m)v_1 + \cdots
+ (\alpha_{1n}\beta_1 + \alpha_{2n}\beta_2 + \cdots + \alpha_{mn}\beta_m)v_n.
\]

The system of linear homogeneous equations

\[
\alpha_{1i}\beta_1 + \alpha_{2i}\beta_2 + \cdots + \alpha_{mi}\beta_m = 0, \qquad i = 1, 2, \ldots, n,
\]

has a nontrivial solution in $F$ by Theorem 5.2.2, since the number of
variables, $m$, exceeds the number of equations, $n$. If $\beta_1, \ldots, \beta_m$ is such a solution
in $F$, then, by the above, $\beta_1 w_1 + \cdots + \beta_m w_m = 0$, yet not all the $\beta_i$ are 0. This
contradicts the linear independence of $w_1, \ldots, w_m$ over $F$. Therefore, $m \le n$.
Similarly, $n \le m$; hence $m = n$. The theorem is now proved, since a minimal
generating set of $V$ over $F$ is a basis of $V$ over $F$ and the number of elements
in this minimal generating set is $\dim_F(V)$, by definition. Therefore, by the
above, $n = \dim_F(V)$, completing the proof. □


A further result, which we shall use in field theory, of a similar nature 
to the things we have been doing is 


Theorem 5.2.6. Let $V$ be a vector space over $F$ such that $\dim_F(V) = n$.
If $m > n$, then any $m$ elements of $V$ are linearly dependent over $F$.


Proof. Let $w_1, \ldots, w_m \in V$ and let $v_1, \ldots, v_n$ be a basis of $V$ over $F$;
here $n = \dim_F(V)$ by Theorem 5.2.5. Therefore,

\[
w_1 = \alpha_{11}v_1 + \cdots + \alpha_{1n}v_n, \quad \ldots, \quad w_m = \alpha_{m1}v_1 + \cdots + \alpha_{mn}v_n.
\]

The proof given in Theorem 5.2.5, that if $m > n$ we can find $\beta_1, \ldots, \beta_m$ in $F$,
and not all 0, such that $\beta_1 w_1 + \cdots + \beta_m w_m = 0$, goes over word for word.
But this establishes that $w_1, \ldots, w_m$ are linearly dependent over $F$. □


We close this section with a final theorem of the same flavor as the pre- 
ceding ones. 


Theorem 5.2.7. Let $V$ be a vector space over $F$ with $\dim_F(V) = n$.
Then any $n$ linearly independent elements of $V$ form a basis of $V$ over $F$.

Proof. We want to show that if $v_1, \ldots, v_n \in V$ are linearly independent
over $F$, then they span $V$ over $F$. Let $v \in V$; then $v, v_1, \ldots, v_n$ are $n + 1$
elements, hence, by Theorem 5.2.6, they are linearly dependent over $F$. Thus
there exist elements $\alpha, \alpha_1, \ldots, \alpha_n$ in $F$, not all 0, such that
$\alpha v + \alpha_1 v_1 + \cdots + \alpha_n v_n = 0$. The element $\alpha$ cannot be 0, otherwise
$\alpha_1 v_1 + \cdots + \alpha_n v_n = 0$, and not all the $\alpha_i$ are 0. This would contradict the
linear independence of the elements $v_1, \ldots, v_n$ over $F$. Thus $\alpha \neq 0$, and so
$v = (-1/\alpha)(\alpha_1 v_1 + \cdots + \alpha_n v_n) = \beta_1 v_1 + \cdots + \beta_n v_n$, where $\beta_i = -\alpha_i/\alpha$.
Therefore, $v_1, \ldots, v_n$ span $V$ over $F$, and thus must form a basis of $V$ over $F$. □


PROBLEMS 
Easier Problems 


1. Determine if the following elements in $V$, the vector space of 3-tuples
over R, are linearly independent over R. 
(a) (1, 2, 3), (4, 5, 6), (7, 8, 9). 
(b) (1, 0, 1), (0, 1, 2), (0, 0, 1). 
(c) (1, 2, 3), (0, 4, 5), &, 3, ). 

2. Find a nontrivial solution in Z; of the system of linear homogeneous 
equations: 


Kot a6 a= 0 
X, + 2x, + 3x, = 0 


3x, + 4x, + 2x, = 0 


3. If $V$ is a vector space of dimension $n$ over $\mathbb{Z}_p$, $p$ a prime, show that $V$ has
$p^n$ elements.

4. Prove all of Lemma 5.2.1. 

5. Let F be a field and V = F[x], the polynomial ring in x over F. Considering 
V as a vector space over F, prove that V is not finite dimensional over F. 

6. If $V$ is a finite-dimensional vector space over $F$ and if $W$ is a subspace of
$V$, prove that:
(a) $W$ is finite dimensional over $F$ and $\dim_F(W) \le \dim_F(V)$.
(b) If $\dim_F(W) = \dim_F(V)$, then $V = W$.

* 7. Define what you feel should be a vector space homomorphism $\psi$ of $V$
into $W$, where $V$ and $W$ are vector spaces over $F$. What can you say about
the kernel, $K$, of $\psi$, where $K = \{v \in V \mid \psi(v) = 0\}$? What should a vector
space isomorphism be?
8. If $V$ is a vector space over $F$ and $W$ is a subspace of $V$, define the
requisite operations in $V/W$ so that $V/W$ becomes a vector space over $F$.

9. If $V$ is a finite-dimensional vector space over $F$ and $v_1, \ldots, v_m$ in $V$ are
linearly independent over $F$, show we can find $w_1, \ldots, w_r$ in $V$, where
$m + r = \dim_F(V)$, such that $v_1, \ldots, v_m, w_1, \ldots, w_r$ form a basis of $V$
over $F$.

10. If $\psi: V \to V'$ is a homomorphism of $V$ onto $V'$ with kernel $K$, show that
$V' \cong V/K$ (as vector spaces over $F$). (See Problem 7).

11. Show that if $\dim_F(V) = n$ and $W$ is a subspace of $V$ with $\dim_F(W) = m$,
then $\dim_F(V/W) = n - m$.

12. If $V$ is a vector space over $F$ of dimension $n$, prove that $V$ is isomorphic
to the vector space of $n$-tuples over $F$ (Example 1). (See Problem 7).


Middle-Level Problems 


13. Let $K \supset F$ be two fields; suppose that $K$, as a vector space over $F$, has
finite dimension $n$. Show that if $a \in K$, then there exist $\alpha_0, \alpha_1, \ldots, \alpha_n$ in $F$,
not all 0, such that

\[
\alpha_0 + \alpha_1 a + \alpha_2 a^2 + \cdots + \alpha_n a^n = 0.
\]

14. Let $F$ be a field, $F[x]$ the polynomial ring in $x$ over $F$, and $f(x) \neq 0$ in
$F[x]$. Consider $V = F[x]/J$ as a vector space over $F$, where $J$ is the ideal
of $F[x]$ generated by $f(x)$. Prove that

\[
\dim_F(V) = \deg f(x).
\]

15. If $V$ and $W$ are two finite-dimensional vector spaces over $F$, prove that
$V \oplus W$ is finite dimensional over $F$ and that
$\dim_F(V \oplus W) = \dim_F(V) + \dim_F(W)$.

16. Let $V$ be a vector space over $F$ and suppose that $U$ and $W$ are subspaces
of $V$. Define $U + W = \{u + w \mid u \in U, w \in W\}$. Prove that:
(a) $U + W$ is a subspace of $V$.
(b) $U + W$ is finite dimensional over $F$ if both $U$ and $W$ are.
(c) $U \cap W$ is a subspace of $V$.
(d) $U + W$ is a homomorphic image of $U \oplus W$.
(e) If $U$ and $W$ are finite dimensional over $F$, then

\[
\dim_F(U + W) = \dim_F(U) + \dim_F(W) - \dim_F(U \cap W).
\]
Harder Problems


17. Let $K \supset F$ be two fields such that $\dim_F(K) = m$. Suppose that $V$ is a
vector space over $K$. Prove that:
(a) $V$ is a vector space over $F$.
(b) If $V$ is finite dimensional over $K$, then it is finite dimensional over $F$.
(c) If $\dim_K(V) = n$, then $\dim_F(V) = mn$ [i.e., $\dim_F(V) = \dim_K(V)\dim_F(K)$].

18. Let $K \supset F$ be fields and suppose that $V$ is a vector space over $K$ such that
$\dim_F(V)$ is finite. If $\dim_F(K)$ is finite, show that $\dim_K(V)$ is finite and
determine its value in terms of $\dim_F(V)$ and $\dim_F(K)$.

19. Let D be an integral domain with 1, which happens to be a finite-dimen- 
sional vector space over a field F. Prove that D is a field. (Note: Since F1, 
which we can identify with F, is in D, the ring structure of D and the vec- 
tor space structure of D over F are in harmony with each other.) 

20. Let V be a vector space over an infinite field F. Show that V cannot be the 
set-theoretic union of a finite number of proper subspaces of V. (Very hard) 


3. FIELD EXTENSIONS 


Our attention now turns to a relationship between two fields $K$ and $F$, where
$K \supset F$. We call $K$ an extension (or extension field) of $F$, and call $F$ a subfield
of $K$. The operations in $F$ are just those of $K$ restricted to the elements of $F$.
In all that follows in this section it will be understood that $F \subset K$.

We say that $K$ is a finite extension of $F$ if, viewed as a vector space over
$F$, $\dim_F(K)$ is finite. We shall write $\dim_F(K)$ as $[K : F]$ and call it the degree
of $K$ over $F$.

We begin our discussion with what is usually the first result one proves 
in talking about finite extensions. 


Theorem 5.3.1. Let $L \supset K \supset F$ be three fields such that both $[L : K]$
and $[K : F]$ are finite. Then $L$ is a finite extension of $F$ and
$[L : F] = [L : K][K : F]$.


Proof. We shall prove that $L$ is a finite extension of $F$ by explicitly
exhibiting a finite basis of $L$ over $F$. In doing so, we shall obtain the stronger
result asserted in the theorem, namely that $[L : F] = [L : K][K : F]$.


Suppose that $[L : K] = m$ and $[K : F] = n$; then $L$ has a basis $v_1, v_2, \ldots, v_m$
over $K$, and $K$ has a basis $w_1, w_2, \ldots, w_n$ over $F$. We shall prove that the $mn$
elements $v_i w_j$, where $i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, n$, constitute a basis of
$L$ over $F$.

We begin by showing that, at least, these elements span $L$ over $F$; this
will, of course, show that $L$ is a finite extension of $F$. Let $a \in L$; since the
elements $v_1, \ldots, v_m$ form a basis of $L$ over $K$, we have $a = k_1 v_1 + \cdots + k_m v_m$,
where $k_1, k_2, \ldots, k_m$ are in $K$. Since $w_1, \ldots, w_n$ is a basis of $K$ over $F$, we
can express each $k_i$ as

\[
k_i = f_{i1}w_1 + f_{i2}w_2 + \cdots + f_{in}w_n,
\]

where the $f_{ij}$ are in $F$. Substituting these expressions for the $k_i$ in the
foregoing expression of $a$, we obtain

\[
a = (f_{11}w_1 + f_{12}w_2 + \cdots + f_{1n}w_n)v_1 + \cdots + (f_{m1}w_1 + f_{m2}w_2 + \cdots + f_{mn}w_n)v_m.
\]

Therefore, on unscrambling this sum explicitly, we obtain

\[
a = f_{11}w_1 v_1 + f_{12}w_2 v_1 + \cdots + f_{ij}w_j v_i + \cdots + f_{mn}w_n v_m.
\]

Thus the $mn$ elements $v_i w_j$ in $L$ span $L$ over $F$; therefore, $[L : F]$ is finite and,
in fact, $[L : F] \le mn$.

To show that $[L : F] = mn$, we need only show that the $mn$ elements
$v_i w_j$ above are linearly independent over $F$, for then, together with the fact
that they span $L$ over $F$, we would have that they form a basis of $L$ over $F$.
By Theorem 5.2.5 we would have the desired result $[L : F] = mn = [L : K][K : F]$.

Suppose then that for some $b_{ij}$ in $F$ we have the relation

\[
0 = b_{11}v_1 w_1 + b_{12}v_1 w_2 + \cdots + b_{1n}v_1 w_n + b_{21}v_2 w_1 + \cdots + b_{2n}v_2 w_n
+ \cdots + b_{m1}v_m w_1 + \cdots + b_{mn}v_m w_n.
\]

Reassembling this sum, we obtain $c_1 v_1 + c_2 v_2 + \cdots + c_m v_m = 0$, where
$c_1 = b_{11}w_1 + \cdots + b_{1n}w_n, \ldots, c_m = b_{m1}w_1 + \cdots + b_{mn}w_n$. Since the $c_i$ are
elements of $K$ and the elements $v_1, \ldots, v_m$ in $L$ are linearly independent over $K$,
we obtain $c_1 = c_2 = \cdots = c_m = 0$.

Recalling that $c_i = b_{i1}w_1 + \cdots + b_{in}w_n$, where the $b_{ij}$ are in $F$ and
where $w_1, \ldots, w_n$ in $K$ are linearly independent over $F$, we deduce from the
fact that $c_1 = c_2 = \cdots = c_m = 0$ that every $b_{ij} = 0$. Thus only the trivial linear
combination, with each coefficient 0, of the elements $v_i w_j$ over $F$ can be 0.
Hence the $v_i w_j$ are linearly independent over $F$. We saw above that this was
enough to prove the theorem. □
The reader should compare Theorem 5.3.1 with the slightly more general
result in Problem 17 of Section 2. The reader should now be able to
solve Problem 17.

As a consequence of the theorem we have the 


Corollary. If $L \supset K \supset F$ are three fields such that $[L : F]$ is finite,
then $[K : F]$ is finite and divides $[L : F]$.


Proof. Since $L \supset K$, $K$ cannot have more linearly independent elements
over $F$ than does $L$. Because, by Theorem 5.2.6, $[L : F]$ is the size of
the largest set of linearly independent elements in $L$ over $F$, we therefore get
that $[K : F] \le [L : F]$, so $[K : F]$ must be finite. Since $L$ is finite dimensional
over $F$ and since $K$ contains $F$, $L$ must be finite dimensional over $K$. Thus all the
conditions of Theorem 5.3.1 are fulfilled, whence $[L : F] = [L : K][K : F]$.
Consequently, $[K : F]$ divides $[L : F]$, as is asserted in the Corollary. □


If K is a finite extension of F, we can say quite a bit about the behavior 
of the elements of K vis-a-vis F. 


Theorem 5.3.2. Suppose that $K$ is a finite extension of $F$ of degree $n$.
Then, given any element $u$ in $K$ there exist elements $\alpha_0, \alpha_1, \ldots, \alpha_n$ in $F$, not
all zero, such that

\[
\alpha_0 + \alpha_1 u + \cdots + \alpha_n u^n = 0.
\]


Proof. Since $[K : F] = \dim_F(K) = n$ and the elements $1, u, u^2, \ldots, u^n$
are $n + 1$ in number, by Theorem 5.2.6 they must be linearly dependent over
$F$. Thus we can find $\alpha_0, \alpha_1, \ldots, \alpha_n$ in $F$, not all 0, such that
$\alpha_0 + \alpha_1 u + \alpha_2 u^2 + \cdots + \alpha_n u^n = 0$, proving the theorem. □


The conclusion of the theorem suggests that we single out elements in 
an extension field that satisfy a nontrivial polynomial. 


Definition. If $K \supset F$ are fields, then $a \in K$ is said to be algebraic over
$F$ if there exists a polynomial $p(x) \neq 0$ in $F[x]$ such that $p(a) = 0$.


By $p(a)$ we shall mean the element $\alpha_0 a^n + \alpha_1 a^{n-1} + \cdots + \alpha_n$ in $K$,
where $p(x) = \alpha_0 x^n + \alpha_1 x^{n-1} + \cdots + \alpha_n$.

If $K$ is an extension of $F$ such that every element of $K$ is algebraic over
$F$, we call $K$ an algebraic extension of $F$. In these terms Theorem 5.3.2 can be
restated as: If $K$ is a finite extension of $F$, then $K$ is an algebraic extension of $F$.

The converse of this is not true; an algebraic extension of F need not be 
of finite degree over F. Can you come up with an example of this situation? 
An element of $K$ that is not algebraic over $F$ is said to be transcendental
over $F$.

Let’s see some examples of algebraic elements in a concrete context.
Consider $\mathbb{C} \supset \mathbb{Q}$, the complex field as an extension of the rational one. The
complex number $a = 1 + i$ is algebraic over $\mathbb{Q}$, since it satisfies $a^2 - 2a + 2 = 0$
over $\mathbb{Q}$. Similarly, the real number $b = \sqrt{1 + \sqrt[3]{1 + \sqrt{2}}}$ is algebraic over $\mathbb{Q}$,
since $b^2 = 1 + \sqrt[3]{1 + \sqrt{2}}$, so $(b^2 - 1)^3 = 1 + \sqrt{2}$, and therefore
$((b^2 - 1)^3 - 1)^2 = 2$. Expanding this out, we get a nontrivial polynomial
expression in $b$ with rational coefficients, which is 0. Thus $b$ is algebraic over $\mathbb{Q}$.
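Both polynomial relations can be checked numerically. The short Python sketch below is only a sanity check, not a proof; the variable names `residual_a` and `residual_b` are our own. The computation for $a = 1 + i$ involves only small integers and is exact, while the one for $b$ is carried out in floating point and therefore holds only up to rounding error.

```python
import math

a = 1 + 1j
# a^2 - 2a + 2 should vanish; these small integer values are exact in floats.
residual_a = a * a - 2 * a + 2

# b = sqrt(1 + cbrt(1 + sqrt(2)))
b = math.sqrt(1 + (1 + math.sqrt(2)) ** (1 / 3))
# ((b^2 - 1)^3 - 1)^2 should equal 2, up to rounding error.
residual_b = ((b * b - 1) ** 3 - 1) ** 2 - 2
```

Expanding $((x^2 - 1)^3 - 1)^2 - 2$ gives a polynomial of degree 12 with rational coefficients having $b$ as a root, which is all that algebraicity requires.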
It is possible to construct real numbers that are transcendental over $\mathbb{Q}$
fairly easily (see Section 6 of Chapter 6). However, it takes some real effort
to establish the transcendence of certain familiar numbers. The two familiar
numbers $e$ and $\pi$ can be shown to be transcendental over $\mathbb{Q}$. That $e$ is
such was proved by Hermite in 1873; the proof that $\pi$ is transcendental
over $\mathbb{Q}$ is much harder and was first carried out by Lindemann in 1882. We
shall not go into the proof here that any particular number is transcendental
over $\mathbb{Q}$. However, in Section 7 of Chapter 6 we shall at least show that
$\pi$ is irrational. This makes it a possible candidate for a transcendental number
over $\mathbb{Q}$, for clearly any rational number $b$ is algebraic over $\mathbb{Q}$ because it
satisfies the polynomial $p(x) = x - b$, which has rational coefficients.


Definition. A complex number is said to be an algebraic number if it 
is algebraic over Q. 


As we shall soon see, the algebraic numbers form a field, which is a 
subfield of C. 

We return to the general development of the theory of fields. We have 
seen in Theorem 5.3.2 that if K is a finite extension of F, then every element 
of K is algebraic over F. We turn this matter around by asking: If K is an ex- 
tension of F and a € K is algebraic over F, can we somehow produce a finite 
extension of F using a? The answer is yes. This will come as a consequence of 
the next theorem—which we prove in a context a little more general than 
what we really need. 


Theorem 5.3.3. Let $D$ be an integral domain with 1 which is a
finite-dimensional vector space over a field $F$. Then $D$ is a field.


Proof. To prove the theorem, we must produce for $a \neq 0$ in $D$ an
inverse, $a^{-1}$, in $D$ such that $aa^{-1} = 1$.

As in the proof of Theorem 5.3.2, if $\dim_F(D) = n$, then $1, a, a^2, \ldots, a^n$
in $D$ are linearly dependent over $F$. Thus for some appropriate $\alpha_0, \alpha_1, \ldots, \alpha_n$
in $F$, not all of which are 0,

\[
\alpha_0 a^n + \alpha_1 a^{n-1} + \cdots + \alpha_n = 0.
\]

Let $p(x) = \beta_0 x^r + \beta_1 x^{r-1} + \cdots + \beta_r \neq 0$ be a polynomial in $F[x]$ of lowest
degree such that $p(a) = 0$. We assert that $\beta_r \neq 0$. For if $\beta_r = 0$, then

\[
\begin{aligned}
0 &= \beta_0 a^r + \beta_1 a^{r-1} + \cdots + \beta_{r-1}a\\
  &= (\beta_0 a^{r-1} + \beta_1 a^{r-2} + \cdots + \beta_{r-1})a.
\end{aligned}
\]

Since $D$ is an integral domain and $a \neq 0$, we conclude that
$\beta_0 a^{r-1} + \beta_1 a^{r-2} + \cdots + \beta_{r-1} = 0$, hence $q(a) = 0$, where
$q(x) = \beta_0 x^{r-1} + \beta_1 x^{r-2} + \cdots + \beta_{r-1}$ in $F[x]$ is of lower degree than $p(x)$,
a contradiction. Thus $\beta_r \neq 0$, hence $\beta_r^{-1}$ is in $F$ and

\[
\frac{a(\beta_0 a^{r-1} + \cdots + \beta_{r-1})}{\beta_r} = -1,
\]

giving us that $-(\beta_0 a^{r-1} + \cdots + \beta_{r-1})/\beta_r$, which is in $D$, is the $a^{-1}$ in $D$ that
we required. This proves the theorem. □
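The formula for $a^{-1}$ obtained in this proof is easy to try out. The following Python sketch takes the element $a = 1 + i$ of $\mathbb{C}$, which satisfies $x^2 - 2x + 2 = 0$, and computes its inverse as $-(\beta_0 a^{r-1} + \cdots + \beta_{r-1})/\beta_r$. The function name `inverse_from_polynomial` and the coefficient-list convention are our own illustrative choices.

```python
def inverse_from_polynomial(betas, a):
    """Given coefficients betas = [b0, b1, ..., br] of a polynomial
    p(x) = b0*x^r + b1*x^(r-1) + ... + br with p(a) = 0 and br != 0,
    return -(b0*a^(r-1) + ... + b_(r-1)) / br, which is the inverse of a."""
    *head, last = betas
    if last == 0:
        raise ValueError("constant term must be nonzero")
    acc = 0
    for coeff in head:          # Horner's rule for b0*a^(r-1) + ... + b_(r-1)
        acc = acc * a + coeff
    return -acc / last

a = 1 + 1j                      # a satisfies x^2 - 2x + 2 = 0
a_inv = inverse_from_polynomial([1, -2, 2], a)
```

Here `a_inv` comes out as $(1 - i)/2$, and multiplying by $a$ gives 1. The sketch works for any polynomial vanishing at $a$ with nonzero constant term; the lowest-degree choice in the proof is what guarantees that the constant term is nonzero.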


Having Theorem 5.3.3 in hand, we want to make use of it. So, how do 
we produce subrings of a field K that contain F and are finite dimensional 
over F? Such subrings, as subrings of a field, are automatically integral do- 
mains, and would satisfy the hypothesis of Theorem 5.3.3. The means to this 
end will be the elements in K that are algebraic over F. 

But first a definition. 


Definition. The element $a$ in the extension $K$ of $F$ is said to be
algebraic of degree $n$ if there is a polynomial $p(x)$ in $F[x]$ of degree $n$ such that
$p(a) = 0$, and no nonzero polynomial of lower degree in $F[x]$ has this
property.

We may assume that the polynomial p(x) in this definition is monic, for 
we could divide this polynomial by its leading coefficient to obtain a monic 
polynomial q(x) in F[x], of the same degree as p(x), and such that q(a) = 0. 
We henceforth assume that this polynomial p(x) is monic; we call it the min- 
imal polynomial for a over F. 


Lemma 5.3.4. Let a € K be algebraic over F with minimal polynomial 
p(x) in F[x]. Then p(x) is irreducible in F[x]. 
Proof. Suppose that $p(x)$ is not irreducible in $F[x]$; then $p(x) = f(x)g(x)$,
where $f(x)$ and $g(x)$ are in $F[x]$ and each has positive degree. Since
$0 = p(a) = f(a)g(a)$, and since $f(a)$ and $g(a)$ are in the field $K$, we conclude
that $f(a) = 0$ or $g(a) = 0$, both of which are impossible, since both $f(x)$ and
$g(x)$ are of lower degree than $p(x)$. Therefore, $p(x)$ is irreducible in $F[x]$. □


Let $a \in K$ be algebraic of degree $n$ over $F$ and let $p(x) \in F[x]$ be its
minimal polynomial over $F$. Given $f(x) \in F[x]$, then $f(x) = q(x)p(x) + r(x)$,
where $q(x)$ and $r(x)$ are in $F[x]$ and $r(x) = 0$ or $\deg r(x) < \deg p(x)$ follows
from the division algorithm. Therefore, $f(a) = q(a)p(a) + r(a) = r(a)$, since
$p(a) = 0$. In short, any polynomial expression in $a$ over $F$ can be expressed as
a polynomial expression in $a$ of degree at most $n - 1$.
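To make this reduction concrete, here is a minimal Python sketch of the division step over $\mathbb{Q}$. The function name `poly_remainder` and the convention of storing coefficients highest degree first are assumptions of ours. As a test case we take $a = \sqrt{2}$, whose minimal polynomial over $\mathbb{Q}$ is $x^2 - 2$, and $f(x) = x^3 + x$.

```python
from fractions import Fraction

def poly_remainder(f, p):
    """Remainder r(x) of f(x) on division by p(x).
    Polynomials are coefficient lists, highest degree first; p must have
    a nonzero leading coefficient."""
    r = [Fraction(c) for c in f]
    p = [Fraction(c) for c in p]
    while len(r) >= len(p):
        factor = r[0] / p[0]
        # Subtract factor * x^(deg r - deg p) * p(x); this cancels r[0].
        for k in range(len(p)):
            r[k] -= factor * p[k]
        r.pop(0)
    return r

# a = sqrt(2) has minimal polynomial p(x) = x^2 - 2 over Q (degree n = 2).
# f(x) = x^3 + x reduces to r(x) = 3x, so f(a) = 3a.
remainder = poly_remainder([1, 0, 1, 0], [1, 0, -2])
```

The result is the coefficient list of $3x + 0$, a polynomial of degree at most $n - 1 = 1$, in line with the remark above.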

Let $F[a] = \{f(a) \mid f(x) \in F[x]\}$. We claim that $F[a]$ is a subfield of $K$ that
contains both $F$ and $a$, and that $[F[a] : F] = n$. By the remark made above,
$F[a]$ is spanned over $F$ by $1, a, a^2, \ldots, a^{n-1}$, so is finite dimensional over $F$.
Moreover, as is easily verified, $F[a]$ is a subring of $K$; as a subring of $K$, $F[a]$ is
an integral domain. Thus, by Theorem 5.3.3, $F[a]$ is a field. Since it is spanned
over $F$ by $1, a, a^2, \ldots, a^{n-1}$, we have that $[F[a] : F] \le n$. To show that
$[F[a] : F] = n$ we must merely show that $1, a, a^2, \ldots, a^{n-1}$ are linearly
independent over $F$. But if $\alpha_0 + \alpha_1 a + \cdots + \alpha_{n-1}a^{n-1} = 0$, with the $\alpha_i$ in $F$, then
$q(a) = 0$, where $q(x) = \alpha_0 + \alpha_1 x + \cdots + \alpha_{n-1}x^{n-1}$ is in $F[x]$. Since $q(x)$ is of
lower degree than $p(x)$, which is the minimal polynomial for $a$ in $F[x]$, we are
forced to conclude that $q(x) = 0$. This implies that $\alpha_0 = \alpha_1 = \cdots = \alpha_{n-1} = 0$.
Therefore, $1, a, a^2, \ldots, a^{n-1}$ are linearly independent over $F$ and form a basis
of $F[a]$ over $F$. Thus $[F[a] : F] = n$. Since $F[a]$ is a field, not merely a set
of polynomial expressions in $a$, we shall denote $F[a]$ by $F(a)$. Note also that if
$M$ is any field that contains both $F$ and $a$, then $M$ contains all polynomial
expressions in $a$ over $F$, hence $M \supset F(a)$. So $F(a)$ is the smallest subfield of $K$
containing both $F$ and $a$.


Definition. F(a) is called the field or extension obtained by adjoining 
a to F. 


We now summarize. 


Theorem 5.3.5. Let $K \supset F$ and suppose that $a$ in $K$ is algebraic over $F$
of degree $n$. Then $F(a)$, the field obtained by adjoining $a$ to $F$, is a finite
extension of $F$, and

\[
[F(a) : F] = n.
\]


Before leaving Theorem 5.3.5, let’s look at it in a slightly different way.
Let $F[x]$ be the polynomial ring in $x$ over $F$, and let $M = (p(x))$ be the ideal
of $F[x]$ generated by $p(x)$, the minimal polynomial for $a$ in $K$ over $F$.
By Lemma 5.3.4, $p(x)$ is irreducible in $F[x]$; hence, by Theorem 4.5.11,
$M$ is a maximal ideal of $F[x]$. Therefore, $F[x]/(p(x))$ is a field by Theorem
4.4.2.

Define the mapping $\psi: F[x] \to K$ by $\psi(f(x)) = f(a)$. The mapping $\psi$ is
a homomorphism of $F[x]$ into $K$, and the image of $F[x]$ in $K$ is merely $F(a)$
by the definition of $F(a)$. What is the kernel of $\psi$? It is by definition
$J = \{f(x) \in F[x] \mid \psi(f(x)) = 0\}$, and since we know $\psi(f(x)) = f(a)$,
$J = \{f(x) \in F[x] \mid f(a) = 0\}$. Since $p(x)$ is in $J$ and $p(x)$ is the minimal polynomial
for $a$ over $F$, $p(x)$ is of the lowest possible degree among the nonzero elements of $J$.
Thus $J = (p(x))$ by the proof of Theorem 4.5.6, and so $J = M$. By the First
Homomorphism Theorem for rings, $F[x]/M \cong$ image of $F[x]$ under $\psi$ $= F(a)$, and
since $F[x]/M$ is a field, we have that $F(a)$ is a field. We leave the proof, from
this point of view, of $[F(a) : F] = \deg p(x)$ to the reader.


PROBLEMS 


1. Show that the following numbers in C are algebraic numbers. 


(a) $\sqrt{2} + \sqrt{3}$.
(b) $\sqrt{7} + \sqrt[3]{12}$.
(c) $2 + i\sqrt{3}$.
(d) $\cos(2\pi/k) + i \sin(2\pi/k)$, $k$ a positive integer.
2. Determine the degrees over $\mathbb{Q}$ of the numbers given in Parts (a) and (c)
of Problem 1.
3. What is the degree of $\cos(2\pi/3) + i \sin(2\pi/3)$ over $\mathbb{Q}$?
4. What is the degree of $\cos(2\pi/8) + i \sin(2\pi/8)$ over $\mathbb{Q}$?


5. If $p$ is a prime number, prove that the degree of $\cos(2\pi/p) + i \sin(2\pi/p)$
over $\mathbb{Q}$ is $p - 1$ and that

\[
f(x) = 1 + x + x^2 + \cdots + x^{p-1}
\]

is its minimal polynomial over $\mathbb{Q}$.
6. (For those who have had calculus) Show that 


is irrational. 


7. If $a$ in $K$ is such that $a^2$ is algebraic over the subfield $F$ of $K$, show that $a$
is algebraic over $F$.

8. If $F \subset K$ and $f(a)$ is algebraic over $F$, where $f(x)$ is of positive degree in
$F[x]$ and $a \in K$, prove that $a$ is algebraic over $F$.

9. In the discussion following Theorem 5.3.5, show that F[x]/M is of degree 
n = deg p(x) over F, and so [F(a) : F] = n = deg p(x). 

10. Prove that $\cos 1°$ is algebraic over $\mathbb{Q}$. ($1°$ = one degree.)

11. If $a \in K$ is transcendental over $F$, let
$F(a) = \{f(a)/g(a) \mid f(x), g(x) \neq 0 \in F[x]\}$. Show that $F(a)$ is a field and is
the smallest subfield of $K$ containing both $F$ and $a$.

12. If $a$ is as in Problem 11, show that $F(a) \cong F(x)$, where $F(x)$ is the field of
rational functions in $x$ over $F$.

13. Let $K$ be a finite field and $F$ a subfield of $K$. If $[K : F] = n$ and $F$ has $q$
elements, show that $K$ has $q^n$ elements.

14. Using the result of Problem 13, show that a finite field has $p^n$ elements
for some prime $p$ and some positive integer $n$.

15. Construct two fields K and F such that K is an algebraic extension of F 
but is not a finite extension of F. 


4. FINITE EXTENSIONS 


We continue in the vein of the preceding section. Again $K \supset F$ will always
denote two fields.

Let $E(K)$ be the set of all elements in $K$ that are algebraic over $F$.
Certainly, $F \subset E(K)$. Our objective is to prove that $E(K)$ is a field. Once this is
done, we’ll see a little of how $E(K)$ sits in $K$.

Without further ado we proceed to 


Theorem 5.4.1. E(K) is a subfield of K. 


Proof. What we must show is that if a, b ∈ K are algebraic over F, then
a ± b, ab, and a/b (if b ≠ 0) are all algebraic over F. This will assure us that
E(K) is a subfield of K. We’ll do all of a ± b, ab, and a/b in one shot.

Let K_0 = F(a) be the subfield of K obtained by adjoining a to F. Since
a is algebraic over F, say of degree m, then, by Theorem 5.3.5, [K_0 : F] = m.
Since b is algebraic over F and since K_0 contains F, we certainly have that b
is algebraic over K_0. If b is algebraic over F of degree n, then it is algebraic
over K_0 of degree at most n. Thus K_1 = K_0(b), the subfield of K obtained by
adjoining b to K_0, is a finite extension of K_0 and [K_1 : K_0] ≤ n.

Thus, by Theorem 5.3.1, [K_1 : F] = [K_1 : K_0][K_0 : F] ≤ mn; that is, K_1 is
a finite extension of F. As such, by Theorem 5.3.2, K_1 is an algebraic extension
of F, so all its elements are algebraic over F. Since a ∈ K_0 ⊂ K_1 and b ∈
K_1, then all of the elements a ± b, ab, a/b are in K_1, hence are algebraic over
F. This is exactly what we wanted. The theorem is proved. □


If we look at the proof a little more carefully, we see that we have actu- 
ally proved a little more, namely the 


Corollary. If a and b in K are algebraic over F of degrees m and n,
respectively, then a ± b, ab, and a/b (if b ≠ 0) are algebraic over F of degree
at most mn.
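
To see the Corollary at work on a concrete pair, here is a small sketch in Python. We take a = √2 and b = √5, both of degree 2 over Q, so that a + b should satisfy a polynomial over Q of degree at most 4. The storage of elements of Q(√2, √5) as 4-tuples of rationals, the helper names mul and evaluate, and the candidate polynomial x^4 − 14x^2 + 9 are all our own illustrative assumptions.

```python
from fractions import Fraction

# An element c0 + c1*sqrt2 + c2*sqrt5 + c3*sqrt10 of Q(sqrt2, sqrt5) is stored
# as the tuple (c0, c1, c2, c3).  This representation is our own assumption.

def mul(u, v):
    """Multiply two elements, using sqrt2*sqrt5 = sqrt10, sqrt2*sqrt10 = 2*sqrt5,
    and sqrt5*sqrt10 = 5*sqrt2."""
    a0, a1, a2, a3 = u
    b0, b1, b2, b3 = v
    return (
        a0 * b0 + 2 * a1 * b1 + 5 * a2 * b2 + 10 * a3 * b3,  # rational part
        a0 * b1 + a1 * b0 + 5 * (a2 * b3 + a3 * b2),         # sqrt2 part
        a0 * b2 + a2 * b0 + 2 * (a1 * b3 + a3 * b1),         # sqrt5 part
        a0 * b3 + a3 * b0 + a1 * b2 + a2 * b1,               # sqrt10 part
    )

def evaluate(coeffs, u):
    """Evaluate a rational polynomial (coefficients, highest degree first) at u
    by Horner's rule."""
    result = (Fraction(0),) * 4
    for c in coeffs:
        result = mul(result, u)
        result = (result[0] + Fraction(c),) + result[1:]
    return result

alpha = (Fraction(0), Fraction(1), Fraction(1), Fraction(0))   # sqrt2 + sqrt5
f = [1, 0, -14, 0, 9]                                          # x^4 - 14x^2 + 9
print(mul(alpha, alpha))   # the square of alpha, namely 7 + 2*sqrt10
print(evaluate(f, alpha))  # every coordinate is 0, so f(alpha) = 0
```

An all-zero tuple certainly represents 0, so the check does not rely on the linear independence of 1, √2, √5, √10 over Q.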


A special case, but one worth noting and recording, is the case K = C
and F = Q. In that case we called the algebraic elements in C over Q the
algebraic numbers. So Theorem 5.4.1 in this case becomes


Theorem 5.4.2. The algebraic numbers form a subfield of C. 


For all we know at the moment, the set of algebraic numbers may very 
well be all of C. This is not the case, for transcendental numbers do exist; we 
show this to be true in Section 6 of Chapter 6. 

We return to a general field K. Its subfield E(K) has a very particular 
quality, which we prove next. This property is that any element in K which is 
algebraic over E(K) must already be in E(K). 

In order not to digress in the course of the proof we are about to give,
we introduce the following notation. If a_1, a_2, ..., a_n are in K, then
F(a_1, ..., a_n) will be the field obtained as follows: K_1 = F(a_1), K_2 =
K_1(a_2) = F(a_1, a_2), K_3 = K_2(a_3) = F(a_1, a_2, a_3), ..., K_n = K_{n−1}(a_n) =
F(a_1, a_2, ..., a_n).

We now prove


Theorem 5.4.3. If u in K is algebraic over E(K), then u is in E(K).


Proof. To prove the theorem, all we must do is show that u is algebraic
over F; this will put u in E(K), and we will be done.

Since u is algebraic over E(K), there is a nontrivial polynomial f(x) =
x^n + a_1 x^{n−1} + a_2 x^{n−2} + ... + a_n, where a_1, a_2, ..., a_n are in E(K), such that
f(u) = 0. Since a_1, a_2, ..., a_n are in E(K), they are algebraic over F of
degrees, say, m_1, m_2, ..., m_n, respectively. We claim that [F(a_1, ..., a_n) : F] is
at most m_1 m_2 ... m_n. To see this, merely carry out n successive applications
of Theorem 5.3.1 to the sequence K_1, K_2, ..., K_n of fields defined above. We
leave its proof to the reader. Thus, since u is algebraic over the field K_n =
F(a_1, a_2, ..., a_n) [after all, the polynomial satisfied by u is f(x) =
x^n + a_1 x^{n−1} + ... + a_n, which has all its coefficients in F(a_1, a_2, ..., a_n)], the
field K_n(u) is a finite extension of K_n, and since K_n is a finite extension of F,
we have, again by Theorem 5.3.1, that K_n(u) is a finite extension of F. Because
u ∈ K_n(u), we obtain from Theorem 5.3.2 that u is algebraic over F.
This puts u in E(K) by the very definition of E(K), thereby proving the theorem. □


There is a famous theorem due to Gauss, often referred to as the Fun- 
damental Theorem of Algebra, which asserts (in terms of extension) that the 
only finite extension of C, the field of complex numbers, is C itself. In reality 
this result is not a purely algebraic one, its validity depending heavily on 
topological properties of the field of real numbers. Be that as it may, it is an 
extremely important theorem in algebra and in many other parts of mathe- 
matics. 

The formulation of the Fundamental Theorem of Algebra in terms of 
the nonexistence of finite extensions of C is a little different from that which 
is usually given. The most frequent form in which this famous result is stated 
involves the concept of a root of a polynomial, a concept we shall discuss at 
some length later. In these terms the Fundamental Theorem of Algebra be- 
comes: A polynomial of positive degree having coefficients in C has at least 
one root in C. The exact meaning of this statement and its equivalence with 
the other form of the theorem stated above will become clearer later, after 
the development of the material on roots. 

A field L with the property of C described in the paragraphs above is 
said to be algebraically closed. If we grant that C is algebraically closed 
(Gauss’ Theorem), then, by Theorem 5.4.3, we also have 


The field of algebraic numbers is algebraically closed. 


PROBLEMS 


1. Show that a = √2 − √3 is algebraic over Q of degree at most 4 by
exhibiting a polynomial f(x) of degree 4 over Q such that f(a) = 0.


2. If a and b in K are algebraic over F of degrees m and n, respectively, and
if m and n are relatively prime, show that [F(a, b): F] = mn. 


3. If a ∈ C is such that p(a) = 0, where

p(x) = x^5 + √2 x^3 + √5 x^2 + √7 x + √12,

show that a is algebraic over Q of degree at most 80.

4. If K ⊃ F is such that [K : F] = p, p a prime, show that K = F(a) for every
a in K that is not in F.


5. If [K : F] = 2^n and T is a subfield of K containing F, show that [T : F] = 2^m
for some m ≤ n.


6. Give an example of two algebraic numbers a and b of degrees 2 and 3,
respectively, such that ab is of degree less than 6 over Q.


7. If K ⊃ F are fields and a_1, ..., a_n are in K, show that F(a_1, ..., a_n) equals
F(a_{σ(1)}, ..., a_{σ(n)}) for any permutation σ of 1, 2, ..., n.


5. CONSTRUCTIBILITY 


In ancient Greece, unlike in the other cultures of the time, the Greek mathe- 
maticians were interested in mathematics as an abstract discipline rather 
than as a pragmatic bag of tricks to do accounts or to carry out measure- 
ments. They developed strong interests and results in number theory and, 
most especially, in geometry. In these areas they posed penetrating ques- 
tions. The questions they asked in geometry—two of which will make up the 
topic treated here—are still of interest and substance. The English mathe- 
matician G. H. Hardy, in his sad but charming little book A Mathematician’s 
Apology, describes the ancient Greek mathematicians as “colleagues from 
another college.” 

Two of these Greek questions will be our concern in this section. But, 
as a matter of fact, the answer to both will emerge as a consequence of the 
criterion for constructibility, which we will obtain. We state these questions 
now and will explain a little later what is entailed in them. 


QUESTION 1 


Can one duplicate a cube using just straight-edge and compass? (By duplicating
a cube, we mean doubling its volume.)


QUESTION 2


Can one trisect an arbitrary angle using just straight-edge and compass?


Despite the seemingly infinite number of angle-trisectors that crop up 
every year, the answer to both questions is “no.” As we shall see, it is impos- 
sible to trisect 60° using just straight-edge and compass. Of course, some 
angles are trisectable, for instance, 0°, 90°, 145°, 180°,..., but most angles 
(in a very precise meaning of “most”) are not.

Before getting to the exact meaning of the questions themselves, we
want to spell out in explicit terms exactly what the rules of the game are.
By a straight-edge we do not mean a ruler, that is, an instrument for measuring
arbitrary lengths. No! A straight-edge is merely a straight line,
with no quantitative or metric properties attributed to it. We are given a
line segment, to which we assign length 1, and all other lengths that we
get from this must be obtainable merely employing a straight-edge and
compass.

Let us call a nonnegative real number, b, a constructible length if, by 
a finite number of applications of the straight-edge and compass and the 
points of intersection obtained between lines and circles so constructed, 
we can construct a line segment of length b, starting out from the line seg- 
ment we have assigned length 1. 

From our high school geometry we recall some things we can do in this 
framework. 


1. Whatever length we construct on one line can be constructed on any 
other line by use of the compass acting as a transfer agent. 


2. We can draw a line parallel to a given line that goes through a given 
point. 
3. We can construct a length n for any nonnegative integer n.


From these and by using results about the similarity of triangles, we can 
construct any nonnegative rational length. We don’t do that at this moment 
for it will come out as a special case of what we are about to do. 

We claim the following properties: 

1. If a and b are constructible lengths, then so is a + b. For if AB is a
line segment of length a and CD is one of length b, we can transfer this
line segment CD, by means of a compass, to obtain the line ABE, where AB
is of length a and BE is of length b. Thus the line segment AE is of length
a + b. If b > a, how would you construct b − a?

2. If a and b are constructible lengths, then so is ab. We may assume
that a ≠ 0 and b ≠ 0, otherwise, the statement is trivial. Consider the following
diagram


[Diagram: lines L_1 and L_2 meeting at P, with A on L_1 and with J and B on L_2]


where L_1 and L_2 are two distinct lines intersecting at P, and such that PA
has length a, PB has length b, and PJ has length 1. Let L_3 be the straight line
through J and A and L_4 the line parallel to L_3 passing through B. If C is the
point of intersection of L_1 and L_4, we have the diagram


[Diagram: the same figure with L_3 through J and A, and L_4 through B meeting L_1 at C]


All of these constructions can be carried out by straight-edge and compass.
From elementary geometry the length of PC is ab. Therefore, ab is constructible.

3. If a and b are constructible and b ≠ 0, then a/b is constructible. Consider
the diagram


[Diagram: lines L_1 and L_2 meeting at P, with L_5 through A and B, and L_6 through J meeting L_1 at D]


where P, A, B, J, L_1, and L_2 are as in Property 2 above. Let L_5 be the line
through A and B and let L_6 be the line through J parallel to L_5. If D is the
point of intersection of L_1 and L_6, then, again by elementary geometry, the
length of PD is a/b. We stress again that all the constructions made can be
carried out by straight-edge and compass.
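
The similar-triangle arguments in Properties 2 and 3 can be checked with coordinates. As an illustrative assumption we place P at the origin and take the two lines through P to be the coordinate axes, with A = (a, 0) on the first and J = (0, 1), B = (0, b) on the second; the text only requires the two lines to be distinct. The helper names below are ours.

```python
from fractions import Fraction

def meet_x_axis(point, direction):
    """Intersect the line through `point` with the given direction vector
    with the x-axis (the line y = 0)."""
    (px, py), (dx, dy) = point, direction
    t = -py / dy                      # parameter at which the y-coordinate vanishes
    return (px + t * dx, Fraction(0))

def product_length(a, b):
    """Length PC in Property 2: draw through B the parallel to the line JA."""
    A, J, B = (a, Fraction(0)), (Fraction(0), Fraction(1)), (Fraction(0), b)
    direction = (A[0] - J[0], A[1] - J[1])   # direction of the line through J and A
    C = meet_x_axis(B, direction)
    return C[0]

def quotient_length(a, b):
    """Length PD in Property 3: draw through J the parallel to the line AB."""
    A, J, B = (a, Fraction(0)), (Fraction(0), Fraction(1)), (Fraction(0), b)
    direction = (A[0] - B[0], A[1] - B[1])   # direction of the line through A and B
    D = meet_x_axis(J, direction)
    return D[0]

a, b = Fraction(3, 2), Fraction(5, 7)
print(product_length(a, b), a * b)    # both equal 15/14
print(quotient_length(a, b), a / b)   # both equal 21/10
```

Only rational operations on the coordinates are used, which is consistent with the claim that lengths built this way from rational data stay rational.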

Of course, this shows that the nonnegative rational numbers are constructible
lengths, since they are quotients of nonnegative integers, which we
know to be constructible lengths. But there are other constructible lengths,
for instance, the irrational number √2. Because we can construct by straight-edge
and compass the right-angle triangle


[Diagram: right triangle ABC with the right angle at B]


with sides AB and BC of length 1, we know, by the Pythagorean Theorem,
that AC is of length √2. So √2 is a constructible length.

In Properties 1 to 3 we showed that the constructible lengths almost 
form a field. What is lacking is the negatives. To get around this, we make 
the 


Definition. The real number a is said to be a constructible number if 
|a|, the absolute value of a, is a constructible length. 


As far as we can say at the moment, any real number might be a constructible
one. We shall soon have a criterion which will tell us that certain
real numbers are not constructible. For instance, we shall be able to deduce
from this criterion that both ∛2 and cos 20° are not constructible. This in
turn will allow us to show that the answer to both Questions 1 and 2 is “no.”

But first we state 


Theorem 5.5.1. The constructible numbers form a subfield of the field 
of real numbers. 


Proof. Properties 1 to 3 almost do the trick; we must adapt Property 1
slightly to allow for negatives. We leave the few details to the reader. □


Our next goal is to show that a constructible number must be an alge- 
braic number—not any old algebraic number, but one satisfying a rather 
stringent condition. 

Note, first, that if a ≥ 0 is a constructible number, then so is √a. Consider
the diagram


[Diagram: semicircle on the diameter AB with center C, the point D on AB, and E on the semicircle above D]


It is of a semicircle of radius (a + 1)/2, center at C, AD is of length a, DB is
of length 1, and DE is the perpendicular to AB at D, intersecting the circle at
E. All this is constructible by straight-edge and compass. From elementary
geometry we have that DE is of length √a, hence √a is constructible.

We now head for the necessary condition that a real number be constructible.
Let K be the field of constructible numbers, and let K_0 be a subfield
of K. By the plane of K_0 we shall mean the set of all points (a, b) in the
real Euclidean plane whose coordinates a and b are in K_0. If (a, b) and (c, d)
are in the plane of K_0, then the straight line joining them has the equation
(y − b)/(x − a) = (b − d)/(a − c), so is of the form ux + vy + w = 0, where
u, v, and w are in K_0. Given two such lines u_1 x + v_1 y + w_1 = 0 and
u_2 x + v_2 y + w_2 = 0, where u_1, v_1, w_1 and u_2, v_2, w_2 are all in K_0, either they
are parallel or their point of intersection is a point in the plane of K_0. (Prove!)

Given a circle whose radius r is in K_0 and whose center (a, b) is in the
plane of K_0, then its equation is (x − a)^2 + (y − b)^2 = r^2, which we see, on
expanding, is of the form x^2 + y^2 + dx + ey + f = 0, where d, e, and f are in K_0.
To see where this circle intersects a line in the plane of K_0, ux + vy + w = 0,
we solve simultaneously the equations of the line and of the circle. For instance,
if v ≠ 0, then y = −(ux + w)/v; substituting this for y in the equation
of the circle x^2 + y^2 + dx + ey + f = 0 leads us to a quadratic equation for
the x-coordinate, c, of this intersection point, of the form c^2 + s_1 c + s_2 = 0,
with s_1 and s_2 in K_0. By the quadratic formula, c = (−s_1 ± √(s_1^2 − 4s_2))/2,
and if the line and circle intersect in the real plane, then s_1^2 − 4s_2 ≥ 0. If
s = s_1^2 − 4s_2 ≥ 0 and if K_1 = K_0(√s), then we see that the x-coordinate, c,
lies in K_1. If √s ∈ K_0, then K_1 = K_0; otherwise, [K_1 : K_0] = 2. Since the
y-coordinate d = −(uc + w)/v, we have that d is also in K_1. Thus the intersection
point (c, d) lies in the plane of K_1 where [K_1 : K_0] = 1 or 2. The story
is similar if v = 0 and u ≠ 0.
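
A sketch of the computation just described, taking K_0 = Q and using exact rational arithmetic. Given the line ux + vy + w = 0 (with v ≠ 0) and the circle x^2 + y^2 + dx + ey + f = 0, it returns the coefficients s_1, s_2 of the monic quadratic satisfied by the x-coordinate of an intersection point, together with s = s_1^2 − 4s_2, and then reports whether Q(√s) has degree 1 or 2 over Q. The function names and the sample lines are our own choices, not taken from the text.

```python
from fractions import Fraction
from math import isqrt

def line_circle_quadratic(u, v, w, d, e, f):
    """Return (s1, s2, s) for the x-coordinate c of an intersection point:
    c^2 + s1*c + s2 = 0 and s = s1^2 - 4*s2.  Requires v != 0."""
    u, v, w, d, e, f = (Fraction(t) for t in (u, v, w, d, e, f))
    lead = u * u + v * v                  # coefficient of x^2 after clearing v^2
    s1 = (2 * u * w + d * v * v - e * u * v) / lead
    s2 = (w * w - e * v * w + f * v * v) / lead
    return s1, s2, s1 * s1 - 4 * s2

def is_rational_square(s):
    """True when the nonnegative rational s is the square of a rational."""
    n, m = s.numerator, s.denominator     # Fraction keeps these in lowest terms
    return s >= 0 and isqrt(n) ** 2 == n and isqrt(m) ** 2 == m

def extension_degree(u, v, w, d, e, f):
    """Degree of Q(sqrt(s)) over Q, or None when line and circle do not meet."""
    _, _, s = line_circle_quadratic(u, v, w, d, e, f)
    if s < 0:
        return None
    return 1 if is_rational_square(s) else 2

# The unit circle x^2 + y^2 - 1 = 0 met by the lines x - y = 0 and x + y - 1 = 0.
print(line_circle_quadratic(1, -1, 0, 0, 0, -1))
print(extension_degree(1, -1, 0, 0, 0, -1))
print(extension_degree(1, 1, -1, 0, 0, -1))
```

For the line x − y = 0 one finds s = 2, so the intersection points need the quadratic extension Q(√2); for x + y − 1 = 0 one finds s = 1, and the points (0, 1) and (1, 0) already lie in the plane of Q.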

Finally, to get the intersection of two circles x^2 + y^2 + dx + ey + f = 0
and x^2 + y^2 + gx + hy + k = 0 in the plane of K_0, subtracting one of these
equations from the other gives us the equation of the line in the plane of K_0,
(d − g)x + (e − h)y + (f − k) = 0. So to find the points of intersection of
two circles in the plane of K_0 is the same as finding the points of intersection
of a line in the plane of K_0 with a circle in that plane. This is precisely the
situation we disposed of above. So if the two circles intersect in the real plane, their
points of intersection lie in the plane of an extension of K_0 of degree 1 or 2.

To construct a constructible length, a, we start in the plane of Q, the
rationals; the straight-edge gives us lines in the plane of Q, and the compass
circles in the plane of Q. By the above, these intersect at a point in the plane
of an extension of degree 1 or 2 of Q. To get to a, we go by this procedure
from the plane of Q to that of L_1, say, where [L_1 : Q] = 1 or 2, then to that
of L_2, where [L_2 : L_1] = 1 or 2, and continue a finite number of times. We
get, this way, a finite sequence Q = L_0 ⊂ L_1 ⊂ ... ⊂ L_n of fields, where each
[L_i : L_{i−1}] = 1 or 2 and where a is in L_n.

By Theorem 5.3.1, [L_n : Q] = [L_n : L_{n−1}][L_{n−1} : L_{n−2}] ... [L_1 : Q], and
since each of [L_i : L_{i−1}] = 1 or 2, we see that [L_n : Q] is a power of 2. Since
a ∈ L_n, we have that Q(a) is a subfield of L_n, hence by the Corollary to Theorem
5.3.1, [Q(a) : Q] must divide a power of 2, hence [Q(a) : Q] = 2^m for
some nonnegative integer m. Equivalently, by Theorem 5.3.5, the minimal
polynomial for a over Q must have degree a power of 2. This is a necessary
condition that a be constructible. We have proved the important criterion for
constructibility, namely


Theorem 5.5.2. In order that the real number a be constructible, it is
necessary that [Q(a) : Q] be a power of 2. Equivalently, the minimal polynomial
of a over Q must have degree a power of 2.


To duplicate a cube of sides 1, so of volume 1, by straight-edge and
compass would require us to construct a cube of sides of length b whose volume
would be 2. But the volume of this cube would be b^3, so we would have
to be able to find a constructible number b such that b^3 = 2.

Given a real number b such that b^3 = 2, then its minimal polynomial
over Q is p(x) = x^3 − 2, for this polynomial is monic and irreducible over Q (if
you want, by the Eisenstein Criterion), and p(b) = 0. Also, as is clear to the
eye, p(x) is of degree 3. Since 3 is not a power of 2, by Theorem 5.5.2, there is
no such constructible b. Therefore, the question of the duplication of the cube
by straight-edge and compass has a negative answer. We summarize this in


Theorem 5.5.3. It is impossible to duplicate a cube of volume 1 by 
straight-edge and compass. 


We now have disposed of the classical Question 1, so we turn our atten- 
tion to Question 2, the trisection of an arbitrary angle by straight-edge and 
compass. 

If we could trisect the particular angle 60°, we would be able to construct
the triangle ABC in the diagram


[Diagram: right triangle ABC with the angle θ at A and the right angle at B]


where θ = 20° and AC is of length 1, by straight-edge and compass. Since AB
is of length cos 20°, we would have that b = cos 20° is a constructible number.

We want to show that b = cos 20° is not a constructible number by producing
its minimal polynomial over Q, and showing that this polynomial is of
degree 3. To this end we recall the triple-angle formula from trigonometry,
namely that cos 3φ = 4 cos^3 φ − 3 cos φ. If b = cos 20°, then, since
cos(3 · 20°) = cos 60° = 1/2, this trigonometric formula becomes 4b^3 − 3b = 1/2,
and so 8b^3 − 6b − 1 = 0. If c = 2b, this becomes c^3 − 3c − 1 = 0. If b is constructible,
then so is c. But p(c) = 0, where p(x) = x^3 − 3x − 1, and this
polynomial is irreducible over Q. (Prove!) So p(x) is the minimal polynomial
for c over Q. Because p(x) is of degree 3, and 3 is not a power of 2, by Theorem
5.5.2 we have that c is not constructible. So we cannot trisect 60° by
straight-edge and compass. This answers Question 2 in the negative.
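
Both cubics met in this section, x^3 − 2 and x^3 − 3x − 1, have degree 3, so each is irreducible over Q exactly when it has no root in Q (any proper factorization of a cubic contains a linear factor). The sketch below searches the finitely many candidates ±r/s with r dividing the constant term and s dividing the leading coefficient; that these are the only possible rational roots of an integer polynomial is a standard fact which we assume here rather than prove. The function names are ours. It also checks numerically that c = 2 cos 20° is a root of x^3 − 3x − 1.

```python
from fractions import Fraction
from math import cos, radians

def divisors(n):
    """Positive divisors of the nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def evaluate(coeffs, x):
    """Horner evaluation; coefficients are listed highest degree first."""
    result = Fraction(0)
    for c in coeffs:
        result = result * x + c
    return result

def rational_roots(coeffs):
    """All rational roots of an integer polynomial with nonzero constant term."""
    lead, const = coeffs[0], coeffs[-1]
    candidates = {Fraction(sign * r, s)
                  for r in divisors(const)
                  for s in divisors(lead)
                  for sign in (1, -1)}
    return sorted(x for x in candidates if evaluate(coeffs, x) == 0)

print(rational_roots([1, 0, 0, -2]))    # x^3 - 2: no rational roots
print(rational_roots([1, 0, -3, -1]))   # x^3 - 3x - 1: no rational roots
print(rational_roots([1, 0, -7, 6]))    # (x - 1)(x - 2)(x + 3), for contrast

c = 2 * cos(radians(20))
print(abs(c ** 3 - 3 * c - 1) < 1e-12)  # floating-point check that p(c) is 0
```

The third polynomial is included only as a contrast, to show that the search does find rational roots when they exist.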


Theorem 5.5.4. It is impossible to trisect 60° by straight-edge and 
compass. 


We hope that this theorem will dissuade any reader from joining the 
hordes of angle-trisectors. There are more profitable and pleasanter ways of 
wasting one’s time. 

There is yet another classical problem of this kind to which the answer
is “no.” This is the question of squaring the circle. This question asks: Can
we construct a square whose area is that of a circle of radius 1 by straight-edge
and compass? This is equivalent to asking whether √π is a constructible
number. If this were the case, then since π = (√π)^2, the number π
would be constructible. But Lindemann proved in 1882 that π is in fact transcendental,
so certainly is not algebraic, and so cannot be constructible. Therefore,
the circle of radius 1 cannot be squared by straight-edge and compass.

Of course, what we did above does not constitute a proof of the impossibility
of squaring the circle, since we have presupposed Lindemann’s result
without proving it. To prove that π is transcendental would take us too far
afield. One might expect that it would be easier to prove that π is not constructible
than to prove that it is not algebraic. This does not seem to be the
case. Until now all proofs that π is not constructible go via the route of exploiting
the transcendence of π.


PROBLEMS 


1. Complete the proof of Theorem 5.5.1. 
2. Prove that x^3 − 3x − 1 is irreducible over Q.
3. Show that the construction given for √a, a ≥ 0, does indeed give us √a.


4. Prove that the regular heptagon (seven-sided polygon with sides of equal 
length) is not constructible by straight-edge and compass. 


6. ROOTS OF POLYNOMIALS 


Let F[x], as usual, be the polynomial ring in x over the field F and let K be an
extension field of F. If a ∈ K and


f(x) = α_0 + α_1 x + ... + α_n x^n,

then by f(a) we understand the element

f(a) = α_0 + α_1 a + ... + α_n a^n


in K. This is the usage we have made of this notation throughout this chapter.
We will now be interested in those a’s in K such that f(a) = 0.


Definition. The element a ∈ K is a root of the polynomial f(x) ∈ F[x]
if f(a) = 0.


In what we have done up until now we have always had an extension 
field K of F given to us and we considered the elements in K algebraic over 
F, that is, those elements of K that are roots of nonzero polynomials in F[x]. 
We saw that if a ∈ K is algebraic over F of degree n (that is, if the minimal
polynomial for a over F is of degree n), then [F(a) : F] = n, where F(a) is
the subfield of K obtained by adjoining a to F.

What we do now is turn the problem around. We no longer will have 
the extension K of F at our disposal. In fact, our principal task will be to pro- 
duce it almost from scratch. We start with some polynomial f(x) of positive 
degree in F[x] as our only bit of data; our goal is to construct an extension 
field K of F in which f(x) will have a root. Once we have this construction of 
K under control, we shall elaborate on the general theme, thereby obtaining 
a series of interesting consequences. 

Before setting off on this search for the appropriate K, we must get 
some information about the relation between the roots of a given polynomial 
and the factorization of that polynomial. 


Lemma 5.6.1. If a ∈ L is a root of the polynomial f(x) ∈ F[x] of degree
n, where L is an extension field of F, then f(x) factors in L[x] as f(x) =
(x − a)q(x), where q(x) is of degree n − 1 in L[x]. Conversely, if f(x) =
(x − a)q(x), with f(x), q(x), and a as above, then a is a root of f(x) in L.


Proof. Since F ⊂ L, F[x] is contained in L[x]. Because a ∈ L, x − a is
in L[x]; by the Division Algorithm for polynomials, we have f(x) =
(x − a)q(x) + r(x), where q(x) and r(x) are in L[x] and where r(x) = 0 or
deg r(x) < deg(x − a) = 1. This yields that r(x) = b, some element of L.
Substituting a for x in the relation above, and using the fact that f(a) = 0, we
obtain 0 = (a − a)q(a) + b = 0 + b = b; thus b = 0. Since r(x) = b = 0, we
have what we wanted, namely f(x) = (x − a)q(x).

For the statement that deg q(x) = n − 1 we note that since f(x) =
(x − a)q(x), then, by Lemma 4.5.2, n = deg f(x) = deg(x − a) + deg q(x) =
1 + deg q(x). This gives us the required result, deg q(x) = n − 1.

The converse is completely trivial. □

One immediate consequence of Lemma 5.6.1 is


Theorem 5.6.2. Let f(x) in F[x] have degree n; then f(x) can have at 
most n roots in any extension, K, of F. 


Proof. We go by induction on n. If n = 1, then f(x) = ax + b, where a
and b are in F and where a ≠ 0. Thus the only root of f(x) is −b/a, an element
of F.

Suppose that the theorem is correct for all polynomials of degree k − 1
over any field. Suppose that f(x) in F[x] is of degree k. If f(x) has no roots in
K, then the theorem is certainly correct. Suppose, then, that a ∈ K is a root of
f(x). By Lemma 5.6.1, f(x) = (x − a)q(x), where q(x) is of degree k − 1 in
K[x]. Any root b in K of f(x) is either a or is a root of q(x), since 0 = f(b) =
(b − a)q(b). By induction, q(x) has at most k − 1 roots in K, hence f(x) has at
most k roots in K. This completes the induction and proves the theorem. □
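
The factorization of Lemma 5.6.1 is easy to carry out by machine. The sketch below (the helper name divide_by_linear and the sample polynomial x^3 − 7x + 6 are our own choices) divides f(x) by x − a using Horner's scheme, returning the quotient q(x) and the remainder, which equals f(a). When a is a root the remainder is 0, and any further root of f(x) must be a root of q(x), just as in the induction step above.

```python
from fractions import Fraction

def divide_by_linear(coeffs, a):
    """Divide f(x) by x - a, where coeffs lists the coefficients of f(x)
    from the highest degree down.  Returns (q, r) with
    f(x) = (x - a) q(x) + r; the remainder r equals f(a)."""
    running = []
    acc = Fraction(0)
    for c in coeffs:
        acc = acc * a + c        # one Horner step
        running.append(acc)
    return running[:-1], running[-1]

f = [1, 0, -7, 6]                # f(x) = x^3 - 7x + 6
q, r = divide_by_linear(f, 2)
print(q, r)                      # quotient x^2 + 2x - 3, remainder 0
print(divide_by_linear(q, 1))    # 1 is a root of the quotient, leaving x + 3
print(divide_by_linear(f, 3)[1]) # 3 is not a root: the remainder is f(3) = 12
```

Peeling off one root at a time in this way lowers the degree by 1 at each step, which is why a polynomial of degree n cannot have more than n roots.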


Actually, the proof yields a little more. To explain this “little more,” we 
need the notion of the multiplicity of a root. 


Definition. If K is an extension of F, then the element a in K is a root
of multiplicity k > 0 of f(x), where f(x) is in F[x], if f(x) = (x − a)^k q(x) for
some q(x) in K[x] and x − a does not divide q(x) (or, equivalently, where
q(a) ≠ 0).


The same proof as that given for Theorem 5.6.2 yields the sharpened 
version: 

Let f(x) be a polynomial of degree n in F[x]; then f(x) can have at most n
roots in any extension field K of F, counting a root of multiplicity k as k roots. 


Theorem 5.6.3. Let f(x) in F[x] be monic of degree n and suppose
that K is an extension of F in which f(x) has n roots, counting a root of multiplicity
k as k roots. If these roots in K are a_1, a_2, ..., a_m, each having
multiplicity k_1, k_2, ..., k_m, respectively, then f(x) factors in K[x] as f(x) =
(x − a_1)^{k_1} (x − a_2)^{k_2} ... (x − a_m)^{k_m}.


Proof. The proof is easy by making use of Lemma 5.6.1 and of induction
on n. We leave the carrying out of the proof to the reader. □


Definition. We say that f(x) in F[x] splits into linear factors over (or 
in) K if f(x) has the factorization in K[x] given in Theorem 5.6.3. 


There is a nice application of Theorem 5.6.3 to finite fields. Let F be a
finite field having q elements, and let a_1, a_2, ..., a_{q−1} be the nonzero elements
of F. Since these form a group of order q − 1 under the multiplication
in F, by Theorem 2.4.5 (proved ever so long ago), a^{q−1} = 1 for any a ≠ 0 in
F. Thus the polynomial x^{q−1} − 1 in F[x] has q − 1 distinct roots in F. By
Theorem 5.6.3, the polynomial x^{q−1} − 1 = (x − a_1)(x − a_2) ... (x − a_{q−1}). If
we also consider 0, then every element a in F satisfies a^q = a, so that the
polynomial x^q − x has the q elements of F as its distinct roots. By Theorem
5.6.3 we have


Theorem 5.6.4. Let F be a finite field having q elements. Then x^q − x
factors in F[x] as


x^q − x = x(x − a_1)(x − a_2) ... (x − a_{q−1}),

where a_1, a_2, ..., a_{q−1} are the nonzero elements of F, and

x^{q−1} − 1 = (x − a_1)(x − a_2) ... (x − a_{q−1}).


A very special case of this theorem is that in which F = Z_p, the integers
modulo the prime p. Here q = p and a_1, a_2, ..., a_{p−1} are just 1, 2, ..., p − 1
in some order.


Corollary. In Z_p[x], the polynomial x^{p−1} − 1 factors as

x^{p−1} − 1 = (x − 1)(x − 2) ... (x − (p − 1)).


Try this out for p = 5, 7, and 11. 
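
One way to try this out is sketched below in Python, under our own naming (poly_mul_mod, product_of_linear_factors, and x_power_minus_one are hypothetical helpers, not anything from the text). We expand the product (x − 1)(x − 2) ... (x − (p − 1)) with coefficients reduced modulo p and compare the result with x^{p−1} − 1.

```python
def poly_mul_mod(f, g, p):
    """Multiply two polynomials in Z_p[x]; coefficient lists run from the
    constant term upward."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out

def product_of_linear_factors(p):
    """Expand (x - 1)(x - 2)...(x - (p - 1)) in Z_p[x]."""
    result = [1]
    for k in range(1, p):
        result = poly_mul_mod(result, [(-k) % p, 1], p)   # multiply by x - k
    return result

def x_power_minus_one(p):
    """Coefficients of x^(p-1) - 1 in Z_p[x], constant term first."""
    return [(-1) % p] + [0] * (p - 2) + [1]

for p in (5, 7, 11):
    print(p, product_of_linear_factors(p) == x_power_minus_one(p))
```

For each of the three primes the two coefficient lists agree, as the Corollary asserts.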

As a corollary to the corollary, we have a result in number theory, 
known as Wilson’s Theorem, which we assigned as Problem 18 in Section 4 of 
Chapter 2. 


Corollary. If p is a prime, then (p − 1)! ≡ −1 mod p.

Proof. By the Corollary above,

x^{p−1} − 1 = (x − 1)(x − 2) ... (x − (p − 1));

substituting x = 0 in this gives us


−1 = (−1)(−2) ... (−(p − 1)) = (−1)^{p−1} · 1 · 2 ... (p − 1) = (−1)^{p−1} (p − 1)!

in Z_p. In the integers this translates into “congruent mod p.” Thus

(−1)^{p−1} (p − 1)! ≡ −1 mod p,


and so (p − 1)! ≡ (−1)^p mod p. But (−1)^p ≡ −1 mod p; hence we have
proved Wilson’s Theorem. □

We change direction to consider the problem mentioned at the beginning
of this section: given f(x) ∈ F[x], to construct a finite extension K of F
in which f(x) has a root. As we shall see in a moment, this construction of K
will be quite easy when we bring the results about polynomial rings proved in
Chapter 4 into play. However, to verify that this construction works will take
a bit of work.


Theorem 5.6.5. Let F be a field and f(x) a polynomial of positive degree
n in F[x]. Then there exists a finite extension K of F, with [K : F] ≤ n, in
which f(x) has a root.


Proof. By Theorem 4.5.12, f(x) is divisible in F[x] by some irreducible
polynomial p(x) in F[x]. Since p(x) divides f(x), deg p(x) ≤ deg f(x) = n,
and f(x) = p(x)q(x) for some polynomial q(x) in F[x]. If b is a root of p(x)
in some extension field, then b is automatically a root of f(x), since f(b) =
p(b)q(b) = 0q(b) = 0. So to prove the theorem it is enough to find an extension
of F in which p(x) has a root.

Because p(x) is irreducible in F[x], the ideal M = (p(x)) of F[x] generated
by p(x) is a maximal ideal of F[x] by Theorem 4.5.11. Thus by Theorem
4.4.2, K = F[x]/M is a field. We claim that this is the field that we are seeking.

Strictly speaking, K does not contain F; as we now show, however, K
does contain a field isomorphic to F. Since every element in M is a multiple in
F[x] of p(x), every such nonzero element must have degree at least that of
p(x). Therefore, M ∩ F = (0). Thus the homomorphism ψ: F[x] → K defined
by ψ(g(x)) = g(x) + M for every g(x) in F[x], when restricted to F, is 1-1 on
F. Therefore, the image F̄ of F in K is a field isomorphic to F. We can identify
F̄, via ψ, with F and so, in this way, we can consider K an extension of F.

Denote x + M ∈ K by a, so that ψ(x) = a, a ∈ K. We leave it to the
reader to show, from the fact that ψ is a homomorphism of F[x] onto K with
kernel M, that ψ(g(x)) = g(a) for every g(x) in F[x]. What is ψ(p(x))? On
the one hand, since p(x) is in F[x], ψ(p(x)) = p(a). On the other hand, since
p(x) is in M, the kernel of ψ, ψ(p(x)) = 0. Equating these two evaluations of
ψ(p(x)), we get that p(a) = 0. In other words, the element a = ψ(x) in K is a
root of p(x).

To finish the proof, all we need is to show that [K : F] = deg p(x) ≤ n.
This came up earlier, in the alternative proof we gave of Theorem 5.3.5. 
There we left this point to be proved by the reader. We shall be a little more 
generous here and carry out the proof in detail. 

Given h(x) in F[x], then, by the Division Algorithm, h(x) = p(x)q(x) +
r(x) where q(x) and r(x) are in F[x], and r(x) = 0 or deg r(x) < deg p(x).
Going modulo M, we obtain that

ψ(h(x)) = ψ(p(x)q(x) + r(x)) = ψ(p(x)q(x)) + ψ(r(x))
        = ψ(p(x))ψ(q(x)) + ψ(r(x))
        = ψ(r(x)) = r(a)


[since ψ(p(x)) = p(a) = 0].

So, since every element in K = F[x]/M is ψ(h(x)) for some h(x) in F[x]
and ψ(h(x)) = r(a), we see that every element of K is of the form r(a), where
r(x) is in F[x] and deg r(x) < deg p(x). If deg p(x) = m, the discussion just
made tells us that 1, a, a^2, ..., a^{m−1} span K over F. Moreover, these elements
are linearly independent over F, since a relation of the type α_0 + α_1 a + ...
+ α_{m−1} a^{m−1} = 0 would imply that g(a) = 0 where g(x) = α_0 + α_1 x + ... +
α_{m−1} x^{m−1} is in F[x]. This puts g(x) in M, which is impossible since g(x) is of
lower degree than p(x), unless g(x) = 0. In other words, we get a contradiction
unless α_0 = α_1 = ... = α_{m−1} = 0. So the elements 1, a, a^2, ..., a^{m−1} are
linearly independent over F. Since they also span K over F, they form a basis
of K over F. Consequently,


dim_F K = [K : F] = m = deg p(x) ≤ n = deg f(x).


The theorem is proved. □
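
The construction of K = F[x]/M can be imitated on a machine. In the sketch below we assume F = Q and p(x) = x^3 − 2 (irreducible over Q by the Eisenstein Criterion), and we represent the element r(a) of K by the list of coefficients of r(x), where deg r(x) < 3. The function names are our own. Multiplication in K is multiplication of polynomials followed by reduction modulo p(x), and the last lines check that a = x + M is a root of p(x).

```python
from fractions import Fraction

def reduce_mod(h, p):
    """Remainder of h(x) on division by the monic polynomial p(x).
    Coefficient lists run from the constant term upward."""
    h = [Fraction(c) for c in h]
    m = len(p) - 1                        # m = deg p
    while len(h) > m:
        lead = h.pop()                    # coefficient of the current top power
        shift = len(h) - m                # subtract lead * x^shift * p(x)
        for i in range(m):
            h[shift + i] -= lead * p[i]
    return h + [Fraction(0)] * (m - len(h))

def mul_in_K(r, s, p):
    """Multiply r(a) and s(a) in K = F[x]/(p(x)); the result has degree < deg p."""
    prod = [Fraction(0)] * (len(r) + len(s) - 1)
    for i, x in enumerate(r):
        for j, y in enumerate(s):
            prod[i + j] += x * y
    return reduce_mod(prod, p)

def evaluate_at(g, elem, p):
    """Evaluate g(x), with coefficients in F, at an element of K (Horner's rule)."""
    result = [Fraction(0)] * (len(p) - 1)
    for c in reversed(g):
        result = mul_in_K(result, elem, p)
        result[0] += c
    return result

p = [-2, 0, 0, 1]                 # p(x) = x^3 - 2
a = [0, 1, 0]                     # the coset a = x + M
a_squared = mul_in_K(a, a, p)
print(mul_in_K(a_squared, a_squared, p))   # a^4 reduces to 2a
print(evaluate_at(p, a, p))                # p(a) = 0 in K
```

Every element is stored with exactly 3 coefficients, which mirrors the fact that 1, a, a^2 form a basis of K over F, so that [K : F] = 3 = deg p(x).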


We carry out an iteration of the argument used in the last proof to 
prove the important 


Theorem 5.6.6. Let f(x) ∈ F[x] be of degree n. Then there exists an
extension K of F of degree at most n! over F such that f(x) has n roots, counting
multiplicities, in K. Equivalently, f(x) splits into linear factors over K.


Proof. We go by induction on n. If n = 1, then f(x) = α + βx, where α,
β ∈ F and where β ≠ 0. The only root of f(x) is −α/β, which is in F. Thus
K = F and [K : F] = 1.

Suppose that the result is true for all fields for polynomials of degree k,
and suppose that f(x) ∈ F[x] is of degree k + 1. By Theorem 5.6.5 there exists
an extension K_1 of F with [K_1 : F] ≤ k + 1 in which f(x) has a root a_1.
Thus in K_1[x], f(x) factors as f(x) = (x − a_1)q(x), where q(x) ∈ K_1[x] is of
degree k. By induction there exists an extension K of K_1 of degree at most k!
over K_1 over which q(x) splits into linear factors. But then f(x) splits into linear
factors over K. Since [K : F] = [K : K_1][K_1 : F] ≤ (k + 1)k! = (k + 1)!,
the induction is completed and the theorem is proved. □


We leave the subject of field extensions at this point. We are exactly at 
what might be described as the beginning of Galois theory. Given an extension
K of F of finite degree over which a given polynomial f(x) splits into linear
factors, there exists an extension of least degree enjoying this property.
Such an extension is called a splitting field of f(x) over F. One then proceeds 
to prove that such a splitting field is unique up to isomorphism. Once this is 
in hand the Galois theory goes into full swing, studying the relationship be- 
tween the group of automorphisms of this splitting field and its subfield struc- 
ture. Eventually, it leads to showing, among many other things, that there 
exist polynomials over the rationals of all degrees 5 or higher whose roots 
cannot be expressed nicely in terms of the coefficients of these polynomials. 

This is a brief and very sketchy description of where we can go from 
here in field theory. But there is no hurry. The readers should assimilate the 
material we have presented; this will put them in a good position to learn 
Galois theory if they are so inclined. 


PROBLEMS 


1. Prove Theorem 5.6.3. 

2. If F is a finite field having the q - 1 nonzero elements a_1, a_2, ..., a_{q-1},
prove that a_1 a_2 ··· a_{q-1} = (-1)^q.

3. Let Q be the rational field and let p(x) = x^4 + x^3 + x^2 + x + 1. Show
that there is an extension K of Q with [K : Q] = 4 over which p(x) splits
into linear factors. [Hint: Find the roots of p(x).]

4. If q(x) = x^n + a_1 x^{n-1} + ··· + a_n, a_n ≠ 0, is a polynomial with integer
coefficients and if the rational number r is a root of q(x), prove that r is
an integer and r | a_n.

5. Show that q(x) = x^3 - 7x + 11 is irreducible over Q.

6. If F is a field of characteristic p ≠ 0, show that (a + b)^p = a^p + b^p for all
a and b in F.

7. Extend the result of Problem 6 by showing that (a + b)^m = a^m + b^m,
where m = p^n.

8. Let F = Z_p, p a prime, and consider the polynomial x^m - x in Z_p[x],
where m = p^n. Let K be a finite extension of Z_p over which x^m - x splits
into linear factors. In K let K_0 be the set of all roots of x^m - x. Show that
K_0 is a field having at most p^n elements.

9. In Problem 8 show that K_0 has exactly p^n elements. (Hint: See Problem
14.)

10. Construct an extension field K_n of Q such that [K_n : Q] = n, for any
n ≥ 1.




11. Define the mapping δ: F[x] → F[x] by

δ(a_0 + a_1 x + a_2 x^2 + ··· + a_n x^n)
    = a_1 + 2a_2 x + ··· + i a_i x^{i-1} + ··· + n a_n x^{n-1}.

Prove that:

(a) δ(f(x) + g(x)) = δ(f(x)) + δ(g(x)).
(b) δ(f(x)g(x)) = f(x)δ(g(x)) + δ(f(x))g(x) for all f(x) and g(x) in F[x].


12. If F is of characteristic p ≠ 0, characterize all f(x) in F[x] such that
δ(f(x)) = 0.

13. Show that if f(x) in F[x] has a root of multiplicity greater than 1 in some
extension field of F, then f(x) and δ(f(x)) are not relatively prime in
F[x].

14. If F is of characteristic p ≠ 0, show that all the roots of x^m - x, where
m = p^n, are distinct.

15. If f(x) in F[x] is irreducible and has a root of multiplicity greater than 1
in some extension of F, show that:

(a) F must be of characteristic p for some prime p.
(b) f(x) = g(x^p) for some polynomial g(x) in F[x].


6 


SPECIAL TOPICS (OPTIONAL) 


In this final chapter we treat several unrelated topics. One of these comes 
from group theory, and all the rest from the theory of fields. In handling 
these special topics, we draw from many of the results and ideas developed 
earlier in the book. Although these topics are somewhat special, each of 
them has results that are truly important in their respective areas. 

The readers who have managed to survive so far should have picked up 
a certain set of techniques, experience, and algebraic know-how to be able to 
follow the material with a certain degree of ease. We now feel free to treat 
the various matters at hand in a somewhat sketchier fashion than we have 
heretofore, leaving a few more details to the reader to fill in. 

The material we shall handle does not lend itself readily to problems, at 
least not to problems of a reasonable degree of difficulty. Accordingly, we 
will assign relatively few exercises. This should come as a relief to those 
wanting to assimilate the material in this chapter. 


1. THE SIMPLICITY OF A_n


In Chapter 3, where we discussed S_n, the symmetric group of degree n, we
showed that if n ≥ 2, then S_n has a normal subgroup A_n, which we called the
alternating group of degree n, which is a group of order n!/2. In fact, A_n was
merely the set of all even permutations in S_n.

In discussing A_n, we said that A_n, for n ≥ 5, was a simple group, that is,
that A_n has no normal subgroups other than (e) and itself. We promised
there that we would prove this fact in Chapter 6. We now make good on this
promise.

To make clear what it is that we are about to prove, we should perhaps
repeat what we said above and formally define what is meant by a simple
group.


Definition. A nonabelian group is said to be simple if its only normal 
subgroups are (e) and itself. 


We impose the proviso that G be nonabelian to exclude the trivial ex- 
amples of cyclic groups of prime order from the designation “simple.” These 
cyclic groups of prime order have no nontrivial subgroups at all, so, perforce, 
they have no proper normal subgroups. An abelian group with no proper
subgroups is easily seen to be cyclic of prime order.

We begin with the very easy 


Lemma 6.1.1. If n ≥ 3 and τ_1, τ_2 are two transpositions in S_n, then τ_1τ_2
is either a 3-cycle or the product of two 3-cycles.


Proof. If τ_1 = τ_2, then τ_1τ_2 = τ_1^2 = e and e is certainly the product of
two 3-cycles, for instance as e = (123)(132).

If τ_1 ≠ τ_2, then they either have one letter in common or none. If they
have one letter in common, we may suppose, on a suitable renumbering, that
τ_1 = (12) and τ_2 = (13). But then τ_1τ_2 = (12)(13) = (132), which is already a
3-cycle.

Finally, if τ_1 and τ_2 have no letter in common, we may suppose, without
loss of generality, that τ_1 = (12) and τ_2 = (34), in which case τ_1τ_2 = (12)(34) =
(142)(143), which is indeed the product of two 3-cycles. The lemma is now
proved. □
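
The three products computed in this proof are easy to check by machine. What follows is only a minimal sketch in Python and is not part of the argument; the helper names cycle and compose are our own, and products are taken right to left (the right-hand factor acts first), which is the convention under which (12)(13) = (132).

```python
def cycle(*letters, n=4):
    """Return the cycle (letters[0] letters[1] ...) as a dict on {1, ..., n}."""
    perm = {i: i for i in range(1, n + 1)}
    for pos, a in enumerate(letters):
        perm[a] = letters[(pos + 1) % len(letters)]
    return perm


def compose(*perms):
    """Product of permutations; the rightmost factor is applied first."""
    result = {i: i for i in perms[0]}
    for p in reversed(perms):
        result = {i: p[result[i]] for i in result}
    return result


identity = cycle()  # no letters moved

# Case 1: equal transpositions give e, and e = (123)(132).
case_equal = (compose(cycle(1, 2), cycle(1, 2)) == identity
              == compose(cycle(1, 2, 3), cycle(1, 3, 2)))
# Case 2: one letter in common gives a single 3-cycle.
case_overlap = compose(cycle(1, 2), cycle(1, 3)) == cycle(1, 3, 2)
# Case 3: no letter in common gives a product of two 3-cycles.
case_disjoint = (compose(cycle(1, 2), cycle(3, 4))
                 == compose(cycle(1, 4, 2), cycle(1, 4, 3)))
```

All three flags should come out True; if products were read left to right instead, the second case would give (123) rather than (132).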


An immediate consequence of Lemma 6.1.1 is that for n ≥ 3 the
3-cycles generate A_n, the alternating group of degree n.


Theorem 6.1.2. If σ is an even permutation in S_n, where n ≥ 3, then σ
is a product of 3-cycles. In other words, the 3-cycles in S_n generate A_n.


Proof. Let σ ∈ S_n be an even permutation. By the definition of the parity
of a permutation, σ is a product of an even number of transpositions.
Thus σ = τ_1τ_2 ··· τ_{2i-1}τ_{2i} ··· τ_{2m-1}τ_{2m} is a product of 2m transpositions
τ_1, τ_2, ..., τ_{2m}. By Lemma 6.1.1, each τ_{2i-1}τ_{2i} is either a 3-cycle or a product
of two 3-cycles. So we get that σ is either a 3-cycle or the product of at most
2m 3-cycles. This proves the theorem. □


We now give an algorithm for computing the conjugate of any permutation
in S_n. Let σ ∈ S_n, and suppose that σ(i) = j. What does τστ^{-1} look
like if τ ∈ S_n? Suppose that τ(i) = s and τ(j) = t; then τστ^{-1}(s) =
τ(σ(τ^{-1}(s))) = τ(σ(i)) = τ(j) = t. In other words, to compute τστ^{-1} replace
every symbol in σ by its image under τ.

For instance, if σ = (123) and τ = (143), then, since τ(1) = 4, τ(2) = 2,
τ(3) = 1, and τ(4) = 3, we see that τστ^{-1} = (421) = (142).
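
The relabelling rule can be compared with a direct computation of the conjugate. The sketch below, in Python, is merely illustrative; the functions cycle and conjugate are names we have chosen, and permutations are stored as dictionaries on {1, ..., n}.

```python
def cycle(letters, n):
    """The cycle (letters[0] letters[1] ...) as a dict on {1, ..., n}."""
    perm = {i: i for i in range(1, n + 1)}
    for pos, a in enumerate(letters):
        perm[a] = letters[(pos + 1) % len(letters)]
    return perm


def conjugate(tau, sigma):
    """Return tau sigma tau^{-1}, computed directly from the definition."""
    tau_inv = {image: i for i, image in tau.items()}
    return {s: tau[sigma[tau_inv[s]]] for s in tau}


n = 4
sigma = cycle((1, 2, 3), n)
tau = cycle((1, 4, 3), n)

direct = conjugate(tau, sigma)
# Relabel each symbol of (123) by its image under tau: (tau(1) tau(2) tau(3)).
relabelled = cycle(tuple(tau[i] for i in (1, 2, 3)), n)
```

Here direct and relabelled are the same permutation, namely (421) = (142), in agreement with the example above.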

Given two k-cycles, say (12 ··· k) and (i_1 i_2 ··· i_k), then they are conjugate
in S_n because if τ is a permutation that sends 1 into i_1, 2 into i_2, ..., k
into i_k, then τ(12 ··· k)τ^{-1} = (i_1 i_2 ··· i_k). Since every permutation is the
product of disjoint cycles and conjugation is an automorphism, we get, from
the result for k-cycles, that to compute τστ^{-1} for any permutation σ, replace
every symbol in σ by its image under τ. In this way we see that it is extremely
easy to compute the conjugate of any permutation.

Given two permutations σ_1 and σ_2 in S_n, then they are conjugate in S_n,
using the observation above, if in their decompositions into products of
disjoint cycles they have the same cycle lengths and each cycle length with the
same multiplicity. Thus, for instance, (12)(34)(567) and (37)(24)(568) are
conjugate in S_8, but (12)(34)(567) and (37)(568) are not.

Recall that by a partition of the positive integer n, we mean a decomposition
of n as n = n_1 + n_2 + ··· + n_k, where 1 ≤ n_1 ≤ n_2 ≤ ··· ≤ n_k. If σ in
S_n is the disjoint product of an n_1-cycle, an n_2-cycle, ..., an n_k-cycle, then
n_1 + n_2 + ··· + n_k = n, and a permutation τ is conjugate to σ if and only if τ
is the disjoint product of cycles in the same way. Therefore, the number of
conjugacy classes in S_n is equal to the number of partitions of n.

For instance, if n = 4, then the partitions of 4 are 4 = 4, 4 = 1 + 3,
4 = 1 + 1 + 2, 4 = 1 + 1 + 1 + 1, and 4 = 2 + 2, which are five in number.
Thus S_4 has five conjugacy classes, namely the classes of (1234), (123),
(12), e, and (12)(34), respectively.
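
For small n the count of conjugacy classes can be confirmed by brute force. The following Python sketch is an illustration only, under the assumption that the symbols are renamed 0, 1, ..., n - 1 and a permutation p is stored as a tuple with p[i] the image of i; all function names are our own.

```python
from itertools import permutations


def conjugate(t, s):
    """Return t s t^{-1} for permutations given as tuples."""
    t_inv = [0] * len(t)
    for i, image in enumerate(t):
        t_inv[image] = i
    return tuple(t[s[t_inv[i]]] for i in range(len(t)))


def cycle_type(p):
    """Cycle lengths of p in nondecreasing order, fixed points included."""
    seen, lengths = set(), []
    for start in range(len(p)):
        length, j = 0, start
        while j not in seen:
            seen.add(j)
            j = p[j]
            length += 1
        if length:
            lengths.append(length)
    return tuple(sorted(lengths))


def conjugacy_classes(n):
    """Conjugacy classes of S_n by brute force (only sensible for small n)."""
    group = list(permutations(range(n)))
    return {frozenset(conjugate(t, s) for t in group) for s in group}


def partitions(n, smallest=1):
    """Yield the partitions of n as nondecreasing tuples of positive integers."""
    if n == 0:
        yield ()
    for first in range(smallest, n + 1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


classes = conjugacy_classes(4)
types = {cycle_type(next(iter(c))) for c in classes}
```

For n = 4 there are five classes, and the set of their cycle types coincides with the set of partitions of 4 produced by partitions(4).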

We summarize everything we said above in three distinct statements. 


Lemma 6.1.3. To find τστ^{-1} in S_n, replace every symbol in the cycle
structure of σ by its image under τ.


Lemma 6.1.4. Two elements in S_n are conjugate if they have similar
decompositions as the product of disjoint cycles.


Lemma 6.1.5. The number of conjugacy classes in S_n is equal to the
number of partitions of n.


Clearly, from the results above, any two 3-cycles in S_n are conjugate in
S_n. A 3-cycle is an even permutation, so is in A_n. One might wonder if any
two 3-cycles are actually conjugate in the smaller group A_n. For n ≥ 5 the
answer is “yes,” and is quite easy to prove.


Lemma 6.1.6. If n ≥ 5, then any two 3-cycles in S_n are already conjugate
in A_n.


Proof. Let σ_1 and σ_2 be two 3-cycles in S_n; by Lemma 6.1.4 they are
conjugate in S_n. By renumbering, we may assume that σ_1 = (123) and σ_2 =
τ(123)τ^{-1} for some τ ∈ S_n. If τ is even, then we are done. If τ is odd, then
ρ = τ(45) is even and ρ(123)ρ^{-1} = τ(45)(123)(45)^{-1}τ^{-1} = τ(123)τ^{-1} = σ_2,
since (45) commutes with (123). Therefore, σ_1 and σ_2 are conjugate in A_n.
We thus see that the lemma is correct. □


In S_3 the two 3-cycles (123) and (132) are conjugate in S_3 but are not
conjugate in A_3, which is a cyclic group of order 3.
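
The contrast between small n and n = 5 can be seen by a direct count. The Python sketch below is only an illustration; it assumes the symbols are renamed 0, ..., n - 1, represents permutations as tuples, and uses function names of our own choosing. It counts the number of A_n-conjugacy classes into which the 3-cycles fall.

```python
from itertools import permutations


def is_even(p):
    """Parity of p, read off from its number of inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return inversions % 2 == 0


def conjugate(t, s):
    """Return t s t^{-1} for permutations given as tuples."""
    t_inv = [0] * len(t)
    for i, image in enumerate(t):
        t_inv[image] = i
    return tuple(t[s[t_inv[i]]] for i in range(len(t)))


def three_cycle_classes(n):
    """Number of A_n-conjugacy classes into which the 3-cycles fall."""
    alternating = [p for p in permutations(range(n)) if is_even(p)]
    # A permutation moving exactly three symbols is necessarily a 3-cycle.
    three_cycles = [p for p in alternating
                    if sum(p[i] != i for i in range(n)) == 3]
    classes = {frozenset(conjugate(t, s) for t in alternating)
               for s in three_cycles}
    return len(classes)
```

One finds two classes for n = 3 and also for n = 4, but a single class for n = 5, as Lemma 6.1.6 asserts.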

We now prove a result that is not only important in group theory, but 
also plays a key role in field theory and the theory of equations. 


Theorem 6.1.7. If n ≥ 5, then the only nontrivial proper normal subgroup
of S_n is A_n.


Proof. Suppose that N is a normal subgroup of S_n and N is neither (e)
nor S_n. Let σ ≠ e be in N. Since the center of S_n is just (e) (see Problem 1)
and the transpositions generate S_n, there is a transposition τ such that
στ ≠ τσ. By Lemma 6.1.4, τ_1 = στσ^{-1} is a transposition, so ττ_1 = τστσ^{-1} ≠ e
is in N, since σ ∈ N and τστ = τστ^{-1} ∈ N because N is normal in S_n. So N
contains an element that is the product of two transpositions, namely ττ_1.

If τ and τ_1 have a letter in common, then, as we saw in the proof of
Lemma 6.1.1, ττ_1 is a 3-cycle, hence N contains a 3-cycle. By Lemma 6.1.4 all
3-cycles in S_n are conjugate to ττ_1 so must fall in N, by the normality of N in
S_n. Thus the subgroup of S_n generated by the 3-cycles, which, according to
Theorem 6.1.2, is all of A_n, lies in N. Note that up to this point we have not
used that n ≥ 5.

We may thus assume that τ and τ_1 have no letter in common. Without
loss of generality we may assume that τ = (12) and τ_1 = (34); therefore,
(12)(34) is in N. Since n ≥ 5, (15) is in S_n, hence (15)(12)(34)(15)^{-1} =
(25)(34) is also in N; thus (12)(34)(25)(34) = (125) is in N. Thus in this case
also, N must contain a 3-cycle. The argument above then shows that N ⊃ A_n.

We have shown that in both cases N must contain A_n. Since there are
no subgroups strictly between A_n and S_n and N ≠ S_n, we obtain the desired
result that N = A_n. □


The result is false for n = 4; the subgroup

N = {e, (12)(34), (13)(24), (14)(23)}

is a proper normal subgroup of S_4 and is not A_4.
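
That this four-element set really is a normal subgroup of S_4 can be checked exhaustively. The Python sketch below is illustrative only; it assumes the symbols 1, 2, 3, 4 are renamed 0, 1, 2, 3, stores a permutation as a tuple of images, and uses helper names of our own.

```python
from itertools import permutations


def compose(f, g):
    """The product fg, with g applied first."""
    return tuple(f[g[i]] for i in range(len(g)))


def inverse(f):
    inv = [0] * len(f)
    for i, image in enumerate(f):
        inv[image] = i
    return tuple(inv)


# e, (12)(34), (13)(24), (14)(23) after renaming 1,2,3,4 -> 0,1,2,3
N = {(0, 1, 2, 3), (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0)}
S4 = list(permutations(range(4)))

# Closure under products suffices for a finite nonempty subset to be a subgroup.
is_subgroup = all(compose(a, b) in N for a in N for b in N)
# Normality: t v t^{-1} stays in N for every t in S_4 and v in N.
is_normal = all(compose(compose(t, v), inverse(t)) in N for t in S4 for v in N)
```

Both flags come out True, and since N has order 4 while A_4 has order 12, N is a normal subgroup of S_4 different from A_4.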

We now know all the normal subgroups of S_n when n ≥ 5. Can we
determine from this all the normal subgroups of A_n for n ≥ 5? The answer is
“yes”; as we shall soon see, A_n is a simple group if n ≥ 5. The proof we give
may strike many as strange, for it hinges on the fact that 60, the order of A_5,
is not a perfect square.


Theorem 6.1.8. The group A_5 is a simple group of order 60.


Proof. Suppose that A_5 is not simple; then it has a proper normal
subgroup N ≠ (e) whose order is as small as possible. Let T =
{σ ∈ S_5 | σNσ^{-1} ⊂ N} be the normalizer of N in S_5. Since N is normal in A_5, we
know that T ⊃ A_5. T is a subgroup of S_5, so if T ≠ A_5, we would have that
T = S_5. But this would tell us that N is normal in S_5, which, by Theorem
6.1.7, would imply that N ⊃ A_5, giving us that N = A_5, contrary to our
supposition that N is a proper subgroup of A_5. So we must have T = A_5. Since (12)
is odd, it is not in A_5, hence is not in T. Therefore, M = (12)N(12)^{-1} ≠ N.

Since N ◁ A_5, we also have that M ◁ A_5 (Prove!), thus both M ∩ N
and MN = {mn | m ∈ M, n ∈ N} are normal in A_5. (See Problem 9.) Because
M ≠ N we have that M ∩ N ≠ N, and since N is a minimal proper normal
subgroup of A_5, it follows that M ∩ N = (e). On the other hand,
(12)MN(12)^{-1} = (12)M(12)^{-1}(12)N(12)^{-1} = NM (since (12)N(12)^{-1} = M
and (12)M(12)^{-1} = N) = MN by the normality of M and N in A_5. Therefore,
the element (12) is in the normalizer of MN in S_5, and since MN is normal
in A_5, we get, as we did above, that MN is normal in S_5, and so MN = A_5
by Theorem 6.1.7.

Consider what we now have. Both M and N are normal subgroups of
A_5, each of order |N|, and MN = A_5 and M ∩ N = (e). We claim, and leave
to the reader, that MN must then have order |N|^2. Since MN = A_5, we obtain
that 60 = |A_5| = |MN| = |N|^2. But this is sheer nonsense, since 60 is not
the square of any integer. This establishes Theorem 6.1.8. □


To go from the simplicity of A_5 to that of A_n for n ≥ 5 is not too hard.
Note that the argument we gave for A_5 did not depend on 5 until the punch
line “60 is not the square of any integer.” In fact, the reasoning is valid as
long as we know that n!/2 is not a perfect square. Thus, for example, if n = 6,
then 6!/2 = 360 is not a square, hence A_6 is a simple group. Since we shall
need this fact in the subsequent discussion, we record it before going on.


Corollary to the Proof of Theorem 6.1.8. A_6 is a simple group.


We return to the question of whether or not n!/2 is a square. As a matter
of fact, it is not if n > 2. This can be shown as a consequence of the
beautiful theorem in number theory (the so-called Bertrand Postulate), which
asserts that for m > 1 there is always a prime between m and 2m. Since we do
not have this result at our disposal, we follow another road to show the
simplicity of A_n for all n ≥ 5.
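
Although we do not prove that n!/2 is never a square for n > 2, the claim is easy to test numerically over a finite range. The Python sketch below is only such a spot check and proves nothing in general; the upper bound 200 is an arbitrary choice of ours.

```python
from math import factorial, isqrt


def is_square(k):
    """True when the nonnegative integer k is a perfect square."""
    root = isqrt(k)
    return root * root == k


# Values of n in 3, ..., 200 for which n!/2 is a perfect square.
squares_found = [n for n in range(3, 201) if is_square(factorial(n) // 2)]
```

The list squares_found is empty. Note that n = 2 is a genuine exception, since 2!/2 = 1 is a square.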

We now prove this important theorem. 


Theorem 6.1.9. For all n ≥ 5 the group A_n is simple.


Proof. By Theorem 6.1.8 we may assume that n ≥ 6. The center of A_n
for n > 3 is merely (e). (Prove!) Since A_n is generated by the 3-cycles, if
σ ≠ e is in A_n, then, for some 3-cycle τ, στ ≠ τσ.

Suppose that N ≠ (e) is a normal subgroup of A_n and that σ ≠ e is in N.
Thus, for some 3-cycle τ, στ ≠ τσ, which is to say, στσ^{-1}τ^{-1} ≠ e. Because
N is normal in A_n, the element τσ^{-1}τ^{-1} is in N, hence στσ^{-1}τ^{-1} is also in N.
Since τ is a 3-cycle, so must στσ^{-1} also be a 3-cycle. Thus N contains the
product of two 3-cycles, and this product is not e. These two 3-cycles involve
at most six letters, so can be considered as sitting in A_6 which, since n ≥ 6,
can be considered embedded isomorphically in A_n. (Prove!) But then
N ∩ A_6 ≠ (e) is a normal subgroup of A_6, so by the Corollary above,
N ∩ A_6 = A_6. Therefore, N must contain a 3-cycle, and since all 3-cycles are
conjugate in A_n (Lemma 6.1.6), N must contain all the 3-cycles in S_n. Since
these 3-cycles generate A_n, we obtain that N is all of A_n, thereby proving the
theorem. □


There are many different proofs of Theorem 6.1.9 (they usually involve
showing that a normal subgroup of A_n must contain a 3-cycle) which
are shorter and possibly easier than the one we gave. However, we like the 
bizarre twist in the proof given in that the whole affair boils down to the fact 
that 60 is not a square. We recommend to the reader to look at some other 
proofs of this very important theorem, especially in a book on group theory. 

The A_n provide us with an infinite family of finite simple groups. There
are several other infinite families of finite simple groups and 26 particular 
ones that do not belong to any infinite family. This determination of all 
finite simple groups, carried out in the 1960s and 1970s by a large number of 
group theorists, is one of the major achievements of twentieth-century math- 
ematics. 


PROBLEMS 


*1. Prove that if n > 2, the center of S_n is (e).

*2. Prove that if n > 3, the center of A_n is (e).

3. What can you say about the cycle structure of the product of two
3-cycles?

4. If m < n, show that there is a subgroup of S_n isomorphic to S_m.

5. Show that an abelian group having no proper subgroups is cyclic of
prime order.


6. How many conjugacy classes are there in S_6?


7. If the elements a_1, a_2, ..., a_n generate the group G and b is a noncentral
element of G, prove that ba_i ≠ a_i b for some i.

8. If M ◁ N and N ◁ G, show that aMa^{-1} is normal in N for every a ∈ G.

9. If M ◁ G and N ◁ G, show that MN is a normal subgroup of G.

10. If n ≥ 5 is odd, show that the n-cycles generate A_n.

11. Show that the centralizer of (12 ··· k) in S_n has order k(n - k)! and that
(12 ··· k) has n!/(k(n - k)!) conjugates in S_n.

12. In the proof of Theorem 6.1.8, show that |MN| = |N|^2.


2. FINITE FIELDS I


Our goal in this section and the next two is to get a complete description of 
all finite fields. What we shall show is that the multiplicative group of 
nonzero elements of a finite field is a cyclic group. This we do in this section. 
In the next two, the objectives will be to establish the existence and
uniqueness of finite fields having p^n elements for any prime p and any positive
integer n.

Some of the things we are about to do already came up in the problem 
sets in group theory and field theory as hard problems. The techniques that 
we use come from group theory and field theory, with a little number theory 
thrown in. 

We recall what the Euler φ-function is. We define the Euler φ-function
by: φ(1) = 1 and, for n > 1, φ(n) is the number of positive integers less than
n and relatively prime to n.

We begin with a result in number theory whose proof, however, will ex- 
ploit group theory. Before doing the general case, we do an example. 

Let n = 12; then φ(12) = 4, for only 1, 5, 7, and 11 are less than 12 and
relatively prime to 12. We compute φ(d) for all the divisors of 12. We have:
φ(1) = 1, φ(2) = 1, φ(3) = 2, φ(4) = 2, φ(6) = 2, and φ(12) = 4. Note that
the sum of all φ(d) over all the divisors of 12 is 12. This is no fluke but is a
special case of


Theorem 6.2.1. If n ≥ 1, then Σ φ(d) = n, where this sum runs over all
divisors d of n.


Proof. Let G be a cyclic group of order n generated by the element a.
If d | n, how many elements of G have order d? If b = a^{n/d}, then all the
solutions in G of x^d = e are the powers e, b, b^2, ..., b^{d-1} of b. How many of
these have order d? We claim, and leave to the reader, that b^r has order d if
and only if r is relatively prime to d. So the number of elements of order d in
G, for every divisor d of n, is φ(d). Every element in G has order some divisor
of n, so if we sum up the number of elements of order d, namely φ(d),
over all d dividing n, we account for each element of G once and only once.
Hence Σ φ(d) = n if we run over all the divisors d of n. The theorem is now
proved. □
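
Both the identity and the counting argument behind it can be checked numerically. The Python sketch below is only an illustration; it models the cyclic group of order n as the integers mod n under addition (an assumption made for convenience, where the order of k is n / gcd(k, n)), and the function names are ours.

```python
from math import gcd


def phi(n):
    """Euler phi-function, straight from the definition (with phi(1) = 1)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


def order_counts(n):
    """Number of elements of each order in the additive cyclic group Z_n."""
    counts = {}
    for k in range(n):
        order = n // gcd(k, n)      # the order of k in Z_n
        counts[order] = counts.get(order, 0) + 1
    return counts


phi_on_divisors_of_12 = [phi(d) for d in divisors(12)]
```

For n = 12 the values of φ on the divisors 1, 2, 3, 4, 6, 12 are 1, 1, 2, 2, 2, 4, which sum to 12, and order_counts(12) assigns exactly φ(d) elements to each order d.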


In a finite cyclic group of order n the number of solutions of x^d = e, the
unit element of G, is exactly d for every d that divides n. We used this fact in 
the proof of Theorem 6.2.1. We now prove a converse to this, getting thereby 
a criterion for cyclicity of a finite group. 


Theorem 6.2.2. Let G be a finite group of order n with the property
that for every d that divides n there are at most d solutions of x^d = e in G.
Then G is a cyclic group.


Proof. Let ψ(d) be the number of elements of G of order d. By
hypothesis, if a ∈ G is of order d, then all the solutions of x^d = e are the
distinct powers e, a, a^2, ..., a^{d-1}, of which φ(d) are of order d. So if
there is an element of order d in G, then ψ(d) = φ(d). On the other hand, if
there is no element in G of order d, then ψ(d) = 0. So for all d | n we have
that ψ(d) ≤ φ(d). However, since every element of G has some order d that
divides n we have that Σ ψ(d) = n, where this sum runs over all divisors d of
n. But

n = Σ ψ(d) ≤ Σ φ(d) = n

since each ψ(d) ≤ φ(d). This gives us that Σ ψ(d) = Σ φ(d), which, together
with ψ(d) ≤ φ(d), forces ψ(d) = φ(d) for every d that divides n. Thus, in
particular, ψ(n) = φ(n) ≥ 1. What does this tell us? After all, ψ(n) is the
number of elements in G of order n, and since ψ(n) ≥ 1 there must be an
element a in G of order n. Therefore, the elements e, a, a^2, ..., a^{n-1} are all
distinct and are n in number, so they must give all of G. Thus G is cyclic with a
as generator, proving the theorem. □


Is there any situation where we can be sure that the equation x^d = e
has at most d solutions in a given group? Certainly. If K* is the group of
nonzero elements of a field under multiplication, then the polynomial x^n - 1
has at most n roots in K* by Theorem 5.6.2. So, if G ⊂ K* is a finite
multiplicative subgroup of K*, then the number of solutions of x^d = 1 in G is at
most d for any positive integer d, so certainly for all d that divide the order
of G. By Theorem 6.2.2 G must be a cyclic group. We have proved


Theorem 6.2.3. If K is a field and K* is the group of nonzero ele- 
ments of K under multiplication, then any finite subgroup of K* is cyclic. 


A very special case of Theorem 6.2.3, but at the moment the most im- 
portant case for us, is 


Theorem 6.2.4. If K is a finite field, then K* is a cyclic group. 


Proof. K* is a finite subgroup of itself, so, by Theorem 6.2.3, K* is
cyclic. □


A particular instance of Theorem 6.2.4 is of great importance in num- 
ber theory, where it is known as the existence of primitive roots mod p for p a 
prime. 


Theorem 6.2.5. If p is a prime, then Z_p* is a cyclic group.
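
For a small prime p one can simply search for the generators whose existence the theorem guarantees. The Python sketch below does this by brute force; it is an illustration rather than an efficient method, it assumes p is prime without checking, and the function names are our own.

```python
def multiplicative_order(a, p):
    """Order of a among the nonzero elements of Z_p (p prime, p not dividing a)."""
    order, power = 1, a % p
    while power != 1:
        power = power * a % p
        order += 1
    return order


def primitive_roots(p):
    """All generators of the cyclic group of nonzero elements of Z_p."""
    return [a for a in range(1, p) if multiplicative_order(a, p) == p - 1]
```

For instance, primitive_roots(7) returns [3, 5]: the powers of 3 mod 7 are 3, 2, 6, 4, 5, 1, which run through all six nonzero residues.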


PROBLEMS 


1. If a ∈ G has order d, prove that a^r also has order d if and only if r and d
are relatively prime.

2. Find a cyclic generator (primitive root) for Z_11*.

3. Do Problem 2 for Z_17*.

4. Construct a field K having nine elements and find a cyclic generator for 
the group K*. 

5. If p is a prime and m = p^2, then Z_m is not a field but the elements
{[a] | (a, p) = 1} form a group under the multiplication in Z_m. Prove that
this group is cyclic of order p(p - 1).

6. Determine all the finite subgroups of C*, where C is the field of complex
numbers.


In the rest of the problems here φ will be the Euler φ-function.


7. If p is a prime, show that φ(p^n) = p^{n-1}(p - 1).

8. If m and n are relatively prime positive integers, prove that
φ(mn) = φ(m)φ(n).

9. Using the result of Problems 7 and 8, find φ(n) in terms of the
factorization of n into prime power factors.

10. Prove that lim_{n→∞} φ(n) = ∞.


3. FINITE FIELDS II: EXISTENCE


Let K be a finite field. Then K must be of characteristic p, p a prime, and K
contains 0, 1, 2, ..., p - 1, the p multiples of the unit element 1 of K. So K ⊃ Z_p,
or, more precisely, K contains a field isomorphic to Z_p. Since K is a vector
space over Z_p and clearly is of finite dimension over Z_p, if [K : Z_p] = n, then
K has p^n elements. To see this, let v_1, v_2, ..., v_n be a basis of K over Z_p. Then
for every distinct choice of (α_1, α_2, ..., α_n), where the α_i are in Z_p, the elements

α_1 v_1 + α_2 v_2 + ··· + α_n v_n

are distinct. Thus, since we can pick (α_1, α_2, ..., α_n) in p^n ways, K has p^n
elements.


The multiplicative group K* of nonzero elements of K is a group of
order p^n - 1. So, we have that a^{m-1} = 1, where m = p^n, for every a ≠ 0 in K,
hence a^m = a. Since this is also obviously true for a = 0, we have that a^m = a
for every a in K. Therefore, the polynomial x^m - x in Z_p[x] has m = p^n
distinct roots in K, namely all the elements of K. Thus x^m - x factors in K[x] as

x^m - x = (x - a_1)(x - a_2) ··· (x - a_m),

where a_1, a_2, ..., a_m are the elements of K.

Everything we just said we already said, in more or less the same way, 
in Section 6 of Chapter 5. Since we wanted these results to be fresh in the 
reader’s mind, we repeated this material here. 

We summarize what we just did in 


Theorem 6.3.1. Let K be a finite field of characteristic p, p a prime.
Then K contains m = p^n elements where n = [K : Z_p], and the polynomial
x^m - x in Z_p[x] splits into linear factors in K[x] as

x^m - x = (x - a_1)(x - a_2) ··· (x - a_m),

where a_1, a_2, ..., a_m are the elements of K.


Two natural questions present themselves:


1. For what primes p and what integers n does there exist a field having p^n
elements?


2. How many nonisomorphic fields are there having p^n elements?


We shall answer both questions in this section and the next. The an- 
swers will be 


1. For any prime p and any positive integer n there exists a finite field
having p^n elements.


2. Two finite fields having the same number of elements are isomorphic. 


It is to these two results that we now address ourselves. First, we settle 
the question of the existence of finite fields. We begin with a general remark 
about irreducible polynomials. 


Lemma 6.3.2. Let F be any field and suppose that p(x) is an irre- 
ducible polynomial in F[x]. Suppose that q(x) in F[x] is such that in some ex- 
tension field of F, p(x) and q(x) have a common root. Then p(x) divides q(x) 
in F[x]. 


Proof. Suppose that p(x) does not divide q(x); since p(x) is irreducible
in F[x], p(x) and q(x) must therefore be relatively prime in F[x]. Thus there 
are polynomials u(x) and v(x) in F[x] such that 


u(x)p(x) + v(x)q(x) = 1. 


Suppose that the element a in some extension K of F is a root of both p(x) 
and q(x); thus p(a) = q(a) = 0. But then 1 = u(a)p(a) + v(a)q(a) = 0, a
contradiction. So we get that p(x) divides q(x) in F[x]. □


Note that we can actually prove a little more, namely 


Corollary. If f(x) and g(x) in F[x] are not relatively prime in K[x],
where K is an extension of F, then they are not relatively prime in F[x]. 


Let F be a field of characteristic p ≠ 0. We claim that the polynomial
f(x) = x^m - x, where m = p^n, cannot have a multiple root in any extension
field K of F. Do you remember what a multiple root of a polynomial is? We
refresh your memory. If g(x) is in F[x] and if K is an extension field of F, then
a in K is a multiple root of g(x) if g(x) = (x - a)^2 q(x) for some q(x) in K[x].

We return to the polynomial f(x) = x^m - x above. Since f(x) =
x(x^{m-1} - 1) and 0 is not a root of x^{m-1} - 1, it is clearly true that 0 is a simple
(i.e., not multiple) root of f(x). Suppose that a ∈ K, K ⊃ F, is a root of f(x);
thus a^m = a. If y = x - a, then

f(y) = y^m - y = (x - a)^m - (x - a) = x^m - a^m - (x - a)
         (since we are in characteristic p ≠ 0 and m = p^n)
     = x^m - x (because a^m = a) = f(x).

So

f(x) = f(y) = y^m - y = (x - a)^m - (x - a)
     = (x - a)((x - a)^{m-1} - 1),

and clearly this is divisible by x - a only to the first power, since x - a does
not divide (x - a)^{m-1} - 1. So a is not a multiple root of f(x).

We have proved


Theorem 6.3.3. If n > 0, then f(x) = x^m - x, where m = p^n, has no
multiple roots in any field of characteristic p.


We should add a word to the proof above to nail down the statement of 
Theorem 6.3.3 as we gave it. Any field of characteristic p ≠ 0 is an extension
of Z_p, and the polynomial f(x) is in Z_p[x]. So the argument above, with K
any field of characteristic p and F = Z_p, proves the theorem in its given form.

We have exactly what we need to prove the important 


Theorem 6.3.4. For any prime p and any positive integer n there
exists a finite field having p^n elements.


Proof. Consider the polynomial x^m - x in Z_p[x], where m = p^n. By
Theorem 5.6.6 there exists a finite extension K of Z_p such that in K[x] the
polynomial x^m - x factors as

x^m - x = (x - a_1)(x - a_2) ··· (x - a_m),

where a_1, a_2, ..., a_m are in K. By Theorem 6.3.3, x^m - x does not have any
multiple roots in K, hence the elements a_1, a_2, ..., a_m are m = p^n distinct
elements. We also know that a_1, a_2, ..., a_m are all the roots of x^m - x in K,
since x^m - x is of degree m.

Let A = {a ∈ K | a^m = a}; as we just saw, A has m distinct elements. We
claim that A is a field. If a, b ∈ A, then a^m = a and b^m = b, hence (ab)^m =
a^m b^m = ab, thus ab ∈ A. Because we are in characteristic p ≠ 0 and m = p^n,
(a + b)^m = a^m + b^m = a + b, hence a + b is in A.

Since A is a finite subset of a field and is closed with respect to sum and
product, A must be a subfield of K. Since A has m = p^n elements, A is thus
the field whose existence was asserted in the statement of the theorem. With
this the theorem is proved. □
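
For small p and n a field with p^n elements can be written down explicitly and the relation a^m = a tested on every element. The Python sketch below does not follow the proof above: instead of collecting the roots of x^m - x in a splitting field, it uses the quotient Z_p[x]/(q(x)) for a monic irreducible q(x) of degree n found by brute-force search. This is a different construction, swapped in because it is easy to compute with. Polynomials are coefficient lists with the constant term first, and all function names are our own.

```python
from itertools import product


def poly_mod(a, q, p):
    """Remainder of a modulo the monic polynomial q, coefficients in Z_p."""
    a = [c % p for c in a]
    n = len(q) - 1
    while len(a) > n:
        lead = a.pop()              # coefficient of the current top power
        shift = len(a) - n          # subtract lead * x^shift * q(x)
        for i in range(n):
            a[shift + i] = (a[shift + i] - lead * q[i]) % p
    return a + [0] * (n - len(a))


def poly_mul(a, b, q, p):
    """Product of a and b in Z_p[x]/(q(x))."""
    prod = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            prod[i + j] = (prod[i + j] + x * y) % p
    return tuple(poly_mod(prod, q, p))


def is_irreducible(q, p):
    """Brute force: monic q of degree n has no monic divisor of degree 1..n//2."""
    n = len(q) - 1
    for d in range(1, n // 2 + 1):
        for low in product(range(p), repeat=d):
            if not any(poly_mod(q, list(low) + [1], p)):
                return False
    return True


def find_irreducible(p, n):
    """First monic irreducible polynomial of degree n over Z_p in search order."""
    for low in product(range(p), repeat=n):
        q = list(low) + [1]
        if is_irreducible(q, p):
            return q


def power(a, e, q, p):
    """a^e in Z_p[x]/(q(x)) by repeated multiplication."""
    result = (1,) + (0,) * (len(q) - 2)
    for _ in range(e):
        result = poly_mul(result, a, q, p)
    return result


p, n = 3, 2
m = p ** n
q = find_irreducible(p, n)                    # x^2 + 1 over Z_3
field = list(product(range(p), repeat=n))     # the nine cosets
all_fixed = all(power(a, m, q, p) == a for a in field)
```

With p = 3 and n = 2 the search returns x^2 + 1, the quotient has nine elements, and all_fixed is True: every element satisfies a^9 = a, as Theorem 6.3.1 requires of a field with nine elements.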


PROBLEMS 


*1. Give the details of the proof of the Corollary to Lemma 6.3.2. 


The next two problems are a repeat of ones given earlier in the book. 


2. If f(x) = a_0 x^n + a_1 x^{n-1} + ··· + a_n is in F[x], let f'(x) be the formal
derivative of f(x) defined by the following equation: f'(x) = n a_0 x^{n-1}
+ (n - 1)a_1 x^{n-2} + ··· + (n - i)a_i x^{n-i-1} + ··· + a_{n-1}. Prove that:
(a) (f(x) + g(x))' = f'(x) + g'(x)
(b) (f(x)g(x))' = f'(x)g(x) + f(x)g'(x) for all f(x) and g(x) in F[x].

*3. Prove that f(x) in F[x] has a multiple root in some extension of F if and
only if f(x) and f'(x) are not relatively prime.

4. If f(x) = x^n - x is in F[x], prove that f(x) does not have a multiple root in
any extension of F if F is either of characteristic 0 or of characteristic
p ≠ 0, where p does not divide n - 1.

5. Use the result of Problem 4 to give another proof of Theorem 6.3.3.

6. If F is a field of characteristic p ≠ 0, construct a polynomial with multiple
roots of the form x^n - x, where p | (n - 1).

7. If K is a field having p^n elements, show that for every m that divides n
there is a subfield of K having p^m elements.


4. FINITE FIELDS III: UNIQUENESS


Now that we know that finite fields exist having p^n elements, for any prime p
and any positive integer n, we might ask: How many finite fields are there
with p^n elements? For this to make any sense at all, what we are really asking
is: How many distinct nonisomorphic fields are there with p^n elements? The
answer to this is short and sweet: one. We shall show here that any two finite
fields having the same number of elements are isomorphic.

Let K and L be two finite fields having p^n elements. Thus K and L are
both vector spaces of dimension n over Z_p. As such, K and L are isomorphic
as vector spaces. On the other hand, K* and L* are both cyclic groups of
order p^n - 1 by Theorem 6.2.4; hence K* and L* are isomorphic as
multiplicative groups. One would imagine that one could put these two
isomorphisms together to prove that K and L are isomorphic as fields. But it just
isn’t so. The proof does not take this direction at all. But the finiteness of K
and L together with these two isomorphisms (of two structures carried by K
and L) do suggest that, perhaps, K and L are isomorphic as fields. This is
indeed the case, as we now proceed to show.

We begin with 


Lemma 6.4.1. If q(x) in $Z_p[x]$ is irreducible of degree n, then
$q(x) \mid (x^m - x)$, where $m = p^n$.


Proof. By Theorem 4.5.11 the ideal (q(x)) of $Z_p[x]$ generated by q(x) is
a maximal ideal of $Z_p[x]$ since q(x) is irreducible in $Z_p[x]$. Let
$A = Z_p[x]/(q(x))$; by Theorem 4.4.3, A is a field of degree n over $Z_p$, hence
has $p^n$ elements. Therefore, $u^m = u$ for every element u in A.

Let a = x + (q(x)) be the coset of x in $A = Z_p[x]/(q(x))$; thus q(a) = 0
and q(x) is the minimal polynomial for a over $Z_p$. Since a is in A, $a^m = a$, so a
is seen as a root of the polynomial $x^m - x$, where $m = p^n$. Thus $x^m - x$ and
q(x) have a common root in A. By Lemma 6.3.2 we have that $q(x) \mid (x^m - x)$. □
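
The divisibility in Lemma 6.4.1 can be checked by machine in small cases. The following is a minimal sketch, not part of the text's argument: polynomials over $Z_p$ are stored as coefficient lists (lowest degree first), and the helper names `trim`, `poly_mod`, `poly_mul_mod`, `x_power_mod`, and `divides_x_m_minus_x` are hypothetical names chosen for this illustration. It tests whether $x^m \equiv x$ modulo a monic q(x), which is the same as $q(x) \mid (x^m - x)$.

```python
def trim(a):
    """Drop trailing zero coefficients (lists are lowest degree first)."""
    while a and a[-1] == 0:
        a.pop()
    return a


def poly_mod(a, q, p):
    """Remainder of a on division by the monic polynomial q over Z_p."""
    a = trim([c % p for c in a])
    dq = len(q) - 1
    while len(a) - 1 >= dq:
        lead, shift = a[-1], len(a) - 1 - dq
        for i, c in enumerate(q):
            a[shift + i] = (a[shift + i] - lead * c) % p
        trim(a)
    return a


def poly_mul_mod(a, b, q, p):
    """Product a*b reduced modulo q, coefficients in Z_p."""
    prod = [0] * (len(a) + len(b) - 1) if a and b else []
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            prod[i + j] = (prod[i + j] + x * y) % p
    return poly_mod(prod, q, p)


def x_power_mod(e, q, p):
    """x**e reduced modulo q over Z_p, by repeated squaring."""
    result, base = [1], poly_mod([0, 1], q, p)
    while e:
        if e & 1:
            result = poly_mul_mod(result, base, q, p)
        base = poly_mul_mod(base, base, q, p)
        e >>= 1
    return result


def divides_x_m_minus_x(q, p):
    """True when the monic q(x) divides x**m - x over Z_p, m = p**deg(q)."""
    n = len(q) - 1
    return x_power_mod(p ** n, q, p) == poly_mod([0, 1], q, p)


# x^3 + x + 1 is irreducible over Z_2, and x^2 + 1 is irreducible over Z_3.
print(divides_x_m_minus_x([1, 1, 0, 1], 2))
print(divides_x_m_minus_x([1, 0, 1], 3))
```

Over $Z_2$ the polynomial $x^2 + 1 = (x + 1)^2$ is reducible and the same check fails for it. The lemma itself only gives one direction, so the check should not be read as an irreducibility test.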


We are now in a position to prove the main result of this section. 


Theorem 6.4.2. If K and L are finite fields having the same number of 
elements, then K and L are isomorphic fields. 


Proof. Suppose that K and L have $p^n$ elements. By Theorem 6.2.4, L*
is a cyclic group generated, say, by the element b in L. Then, certainly,
$Z_p(b)$, the field obtained by adjoining b to $Z_p$, is all of L. Since
$[L:Z_p] = n$, by Theorem 5.3.2 b is algebraic over $Z_p$ of degree n, with
n = deg(q(x)), where q(x) is the minimal polynomial in $Z_p[x]$ for b, and is
irreducible in $Z_p[x]$.

The mapping $\psi: Z_p[x] \to L = Z_p(b)$ defined by $\psi(f(x)) = f(b)$ is a
homomorphism of $Z_p[x]$ onto L with kernel (q(x)), the ideal of $Z_p[x]$
generated by q(x). So $L \simeq Z_p[x]/(q(x))$.

Because q(x) is irreducible in $Z_p[x]$ of degree n, by Lemma 6.4.1 q(x)
must divide $x^m - x$, where $m = p^n$. However, by Lemma 6.3.1, the
polynomial $x^m - x$ factors in K[x] as


$x^m - x = (x - a_1)(x - a_2) \cdots (x - a_m)$,

where $a_1, a_2, \ldots, a_m$ are all the elements of K. Therefore, q(x) divides
$(x - a_1)(x - a_2) \cdots (x - a_m)$. By the Corollary to Theorem 4.5.10, q(x)
cannot be relatively prime to all the $x - a_i$ in K[x], hence for some j, q(x) and
$x - a_j$ have a common factor of degree at least 1. In short, $x - a_j$ must
divide q(x) in K[x], so $q(x) = (x - a_j)h(x)$ for some h(x) in K[x].
Therefore, $q(a_j) = 0$.

Since q(x) is irreducible in $Z_p[x]$ and $a_j$ is a root of q(x), q(x) must be
the minimal polynomial for $a_j$ in $Z_p[x]$. Thus $Z_p(a_j) \simeq Z_p[x]/(q(x)) \simeq L$. This
tells us, among other things, that we have $[Z_p(a_j):Z_p] = n$, and since
$Z_p(a_j) \subset K$ and $[K:Z_p] = n$ we conclude that $Z_p(a_j) = K$. Therefore,
$K = Z_p(a_j) \simeq L$. Thus we get the result that we are after, namely, that K and
L are isomorphic fields. This proves the theorem. □


Combining Theorems 6.3.4 and 6.4.2, we have 


Theorem 6.4.3. For any prime p and any positive integer n there exists,
up to isomorphism, one and only one field having $p^n$ elements.
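
To see the proof of Theorem 6.4.2 at work, one can take two different models of the field with 8 elements and build the isomorphism the way the proof does: find a root $a_j$ of q(x) in K and send the coset of f(x) to $f(a_j)$. The sketch below is an illustration under stated assumptions, not the text's own construction. The two cubics and all function names (`reduce_mod`, `mul`, `add`, `evaluate`) are choices made for this example, and elements of each field are stored as triples of coefficients.

```python
from itertools import product

P = 2
Q = [1, 1, 0, 1]   # q(x) = x^3 + x + 1, defines L = Z_2[x]/(q(x))
R = [1, 0, 1, 1]   # r(x) = x^3 + x^2 + 1, defines K = Z_2[x]/(r(x))


def reduce_mod(a, modulus):
    """Reduce a polynomial modulo a monic cubic over Z_2; return a 3-tuple."""
    a = [c % P for c in a]
    for k in range(len(a) - 1, 2, -1):      # clear degrees >= 3 from the top
        lead = a[k]
        if lead:
            for i, c in enumerate(modulus):
                a[k - 3 + i] = (a[k - 3 + i] - lead * c) % P
    return tuple((a + [0, 0, 0])[:3])


def mul(a, b, modulus):
    """Multiply two field elements (3-tuples) modulo the given cubic."""
    prod = [0] * 5
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            prod[i + j] += x * y
    return reduce_mod(prod, modulus)


def add(a, b):
    return tuple((x + y) % P for x, y in zip(a, b))


def evaluate(coeffs, point, modulus):
    """Evaluate a polynomial with Z_2 coefficients at a field element (Horner)."""
    acc = (0, 0, 0)
    for c in reversed(coeffs):
        acc = add(mul(acc, point, modulus), (c % P, 0, 0))
    return acc


elements = list(product(range(P), repeat=3))

# Step from the proof: some element a_j of K is a root of q(x).
root = next(a for a in elements if evaluate(Q, a, R) == (0, 0, 0))

# The map L -> K sends the coset of f(x) to f(a_j).
iso = {f: evaluate(f, root, R) for f in elements}
print(len(set(iso.values())))   # number of distinct images
```

The brute-force search for the root is only sensible for tiny fields; it stands in for the existence argument in the proof, which uses the factorization of $x^m - x$ in K[x].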


5. CYCLOTOMIC POLYNOMIALS 


Let C be the field of complex numbers. As a consequence of De Moivre’s
Theorem the complex number $\theta_n = \cos(2\pi/n) + i \sin(2\pi/n)$ satisfies $\theta_n^n = 1$
and $\theta_n^m \neq 1$ if 0 < m < n. We called $\theta_n$ a primitive nth root of unity. The other
primitive nth roots of unity are


$\theta_n^k = \cos(2\pi k/n) + i \sin(2\pi k/n)$,


where (k, n) = 1 and 1 ≤ k < n.

Clearly, $\theta_n$ satisfies the polynomial $x^n - 1$ in Q[x], where Q is the field
of rational numbers. We want to find the minimal (monic) polynomial for $\theta_n$
over Q.

We define a sequence of polynomials inductively. At first glance they
might not seem relevant to the question of finding the minimal polynomial
for $\theta_n$ over Q. It will turn out that these polynomials are highly relevant to
that question for, as we shall prove later, the polynomial $\phi_n(x)$ that we are
about to introduce is a monic polynomial with integer coefficients, is
irreducible over Q, and, moreover, $\phi_n(\theta_n) = 0$. This will tell us that $\phi_n(x)$ is the
desired monic minimal polynomial for $\theta_n$ over Q.

We now go about the business of defining these polynomials. 

Definition. The polynomials $\phi_n(x)$ are defined inductively by:
(a) $\phi_1(x) = x - 1$.
(b) If n > 1, then $\phi_n(x) = (x^n - 1)/\prod \phi_d(x)$, where in the product in the
denominator d runs over all the divisors of n except for n itself.
These polynomials are called the cyclotomic polynomials and $\phi_n(x)$ is
called the nth cyclotomic polynomial.


At the moment it is not obvious that the $\phi_n(x)$ so defined are even
polynomials, nor do we, as yet, have a clue as to the nature of the coefficients
of these polynomials $\phi_n(x)$. All this will come in due time. But first we want
to look at some early examples.


Examples 


1. $\phi_2(x) = (x^2 - 1)/\phi_1(x) = (x^2 - 1)/(x - 1) = x + 1$.
2. $\phi_3(x) = (x^3 - 1)/\phi_1(x) = (x^3 - 1)/(x - 1) = x^2 + x + 1$.
3. $\phi_4(x) = (x^4 - 1)/(\phi_1(x)\phi_2(x)) = (x^4 - 1)/((x - 1)(x + 1)) =
(x^4 - 1)/(x^2 - 1) = x^2 + 1$.
4. $\phi_5(x) = (x^5 - 1)/\phi_1(x) = (x^5 - 1)/(x - 1) = x^4 + x^3 + x^2 + x + 1$.
5. $\phi_6(x) = (x^6 - 1)/(\phi_1(x)\phi_2(x)\phi_3(x)) =
(x^6 - 1)/((x - 1)(x + 1)(x^2 + x + 1)) = x^2 - x + 1$.


We notice a few things about the polynomials above: 


1. They are all monic polynomials with integer coefficients.

2. The degree of $\phi_n(x)$ is $\varphi(n)$, where $\varphi$ is the Euler $\varphi$-function, for
1 ≤ n ≤ 6. (Check this out.)

3. Each $\phi_n(x)$, for 1 ≤ n ≤ 6, is irreducible in Q[x]. (Verify!)

4. For 1 ≤ n ≤ 6, $\theta_n$ is a root of $\phi_n(x)$. (Verify!)


These few cases give us a hint as to what the general story might be for
all $\phi_n(x)$. A hint, yes, but only a hint. To establish these desired properties
for $\phi_n(x)$ will take some work.
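
The inductive definition translates directly into a short computation, which makes it easy to extend the table of examples. The following is a sketch under stated assumptions: polynomials are tuples or lists of integer coefficients (lowest degree first), and the names `poly_mul`, `poly_div_exact`, `cyclotomic`, and `euler_phi` are hypothetical helpers written for this illustration. The division routine raises an error if a remainder appears, so a successful run is itself evidence that the quotient is a polynomial with integer coefficients in the cases tried.

```python
from functools import lru_cache
from math import gcd


def poly_mul(a, b):
    """Product of two integer polynomials (coefficients lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def poly_div_exact(num, den):
    """Quotient num/den for integer polynomials with den monic.

    Raises ValueError if the division leaves a remainder.
    """
    num = list(num)
    quot = [0] * (len(num) - len(den) + 1)
    for k in range(len(quot) - 1, -1, -1):
        quot[k] = num[k + len(den) - 1]
        for i, c in enumerate(den):
            num[k + i] -= quot[k] * c
    if any(num):
        raise ValueError("division is not exact")
    return quot


@lru_cache(maxsize=None)
def cyclotomic(n):
    """phi_n(x) from the inductive definition, as a tuple of coefficients."""
    numerator = [-1] + [0] * (n - 1) + [1]          # x^n - 1
    denominator = [1]
    for d in range(1, n):                           # proper divisors of n
        if n % d == 0:
            denominator = poly_mul(denominator, cyclotomic(d))
    return tuple(poly_div_exact(numerator, denominator))


def euler_phi(n):
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


for n in range(1, 7):
    print(n, cyclotomic(n), euler_phi(n))
```

For n from 1 to 6 this reproduces the examples above, and the degree of each output agrees with the Euler function. A computation of this kind only confirms individual cases; it is no substitute for the proofs that follow.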

To gain some further insight into these polynomials, we consider a
particular case, one in which $n = p^m$, where p is a prime. To avoid cumbersome
subscripts, we shall denote $\phi_n(x)$ by $\psi^{(m)}(x)$, where $n = p^m$. The prime p will
be kept fixed in the discussion. We shall obtain explicit formulas for the
$\psi^{(m)}(x)$’s and determine their basic properties. However, the method we use
will not be applicable to the general case of $\phi_n(x)$. To study the general
situation will require a wider and deeper set of techniques than those needed
for $\psi^{(m)}(x)$.

We note one simple example. If p is a prime, the only divisor of p that
is not p itself is 1. From the definition of $\phi_p(x) = \psi^{(1)}(x)$ we have that


$\psi^{(1)}(x) = \phi_p(x) = \frac{x^p - 1}{x - 1} = x^{p-1} + \cdots + x + 1$.

Note that in studying the Eisenstein Criterion we showed that this
polynomial is irreducible in Q[x].

What can we say for the higher $\psi^{(m)}(x)$’s?

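Lemma 6.5.1. For all m ≥ 1,


$\psi^{(m)}(x) = \frac{x^{p^m} - 1}{x^{p^{m-1}} - 1} = 1 + x^{p^{m-1}} + x^{2p^{m-1}} + \cdots + x^{(p-1)p^{m-1}}$.


Proof. We go by induction on m.

If m = 1, we showed above that $\psi^{(1)}(x) = (x^p - 1)/(x - 1) = 1 + x +
x^2 + \cdots + x^{p-1}$, so the lemma is true in this case.

Suppose that $\psi^{(r)}(x) = (x^{p^r} - 1)/(x^{p^{r-1}} - 1)$ for all r < m. Consider
$\psi^{(m)}(x)$. Since the only proper divisors of $p^m$ are $1, p, p^2, \ldots, p^{m-1}$, from the
definition of $\psi^{(m)}(x)$ we have that

$\psi^{(m)}(x) = \frac{x^{p^m} - 1}{(x - 1)\psi^{(1)}(x) \cdots \psi^{(m-1)}(x)}$.

By induction, $\psi^{(r)}(x) = (x^{p^r} - 1)/(x^{p^{r-1}} - 1)$ for r < m, hence


$(x - 1)\psi^{(1)}(x) \cdots \psi^{(m-1)}(x) = (x - 1) \cdot \frac{x^p - 1}{x - 1} \cdot \frac{x^{p^2} - 1}{x^p - 1} \cdots \frac{x^{p^{m-1}} - 1}{x^{p^{m-2}} - 1} = x^{p^{m-1}} - 1$.


But then

$\psi^{(m)}(x) = \frac{x^{p^m} - 1}{x^{p^{m-1}} - 1}$,

completing the induction and proving the lemma. □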
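
As a worked instance of Lemma 6.5.1 (an illustration added here, with p = 3 and m = 2 chosen only for concreteness), the proper divisors of 9 are 1 and 3, so the definition gives:

```latex
\[
\psi^{(2)}(x) = \phi_9(x)
  = \frac{x^9 - 1}{\phi_1(x)\,\phi_3(x)}
  = \frac{x^9 - 1}{(x - 1)(x^2 + x + 1)}
  = \frac{x^9 - 1}{x^3 - 1}
  = 1 + x^3 + x^6 .
\]
```

The degree is $6 = 3^{1}(3 - 1)$, and the exponents that occur are the multiples of $p^{m-1} = 3$ up to $(p - 1)p^{m-1} = 6$, as the lemma predicts.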


Note here that


$\psi^{(m)}(x) = \frac{x^{p^m} - 1}{x^{p^{m-1}} - 1} = 1 + x^{p^{m-1}} + x^{2p^{m-1}} + \cdots + x^{(p-1)p^{m-1}}$

is a monic polynomial with integer coefficients. Its degree is clearly $p^{m-1}(p - 1)$,
which is indeed $\varphi(p^m)$. Finally, if $\theta$ is a primitive $p^m$th root of unity, then
$\theta^{p^m} = 1$, but $\theta^{p^{m-1}} \neq 1$, hence $\psi^{(m)}(\theta) = 0$; so $\theta$ is a root of $\psi^{(m)}(x)$. The final
thing we want to know is: Is $\psi^{(m)}(x)$ irreducible over Q?

Note that


$\psi^{(m)}(x) = 1 + x^{p^{m-1}} + \cdots + x^{(p-1)p^{m-1}} = \psi^{(1)}(x^{p^{m-1}})$


and we know that $\psi^{(1)}(x)$ is irreducible in Q[x]. We shall use the Eisenstein
Criterion to prove that $\psi^{(m)}(x)$ is irreducible in Q[x].

We digress for a moment. If f(x) and g(x) are two polynomials with
integer coefficients, we define $f(x) \equiv g(x) \bmod p$ if f(x) = g(x) + pr(x), where
r(x) is a polynomial with integer coefficients. This is equivalent to saying that
the corresponding coefficients of f(x) and g(x) are congruent mod p.
Expanding $(f(x) + g(x))^p$ by the binomial theorem, and using that all the
binomial coefficients $\binom{p}{i}$ with 0 < i < p are divisible by p, since p is a prime,
we arrive at $(f(x) + g(x))^p \equiv f(x)^p + g(x)^p \bmod p$.

Given $f(x) = a_0x^n + a_1x^{n-1} + \cdots + a_n$, where the $a_i$ are integers, then,
by the above,


$f(x)^p = (a_0x^n + a_1x^{n-1} + \cdots + a_n)^p \equiv a_0^p x^{np} + a_1^p x^{(n-1)p} + \cdots + a_n^p$
$\equiv a_0x^{np} + a_1x^{(n-1)p} + \cdots + a_n \bmod p$,


the latter congruence being a consequence of Fermat’s Theorem (the
Corollary to Theorem 2.4.8). Since $f(x^p) = a_0x^{np} + a_1x^{(n-1)p} + \cdots + a_n$, we obtain
that


$f(x^p) \equiv f(x)^p \bmod p$.

Iterating what we just did, we arrive at

$f(x^{p^k}) \equiv f(x)^{p^k} \bmod p$


for all nonnegative k.
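
The congruence $f(x^p) \equiv f(x)^p \bmod p$ is easy to test on a concrete polynomial. The sketch below is only an illustration; the helper names `poly_mul`, `poly_pow`, `substitute_power`, and `congruent_mod_p` are hypothetical, and the sample polynomial is arbitrary. Coefficient lists are lowest degree first.

```python
def poly_mul(a, b):
    """Product of two integer polynomials (coefficients lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def poly_pow(a, e):
    """a(x)**e by repeated multiplication (fine for small e)."""
    result = [1]
    for _ in range(e):
        result = poly_mul(result, a)
    return result


def substitute_power(a, p):
    """Coefficients of f(x**p), given those of f(x)."""
    out = [0] * ((len(a) - 1) * p + 1)
    for i, c in enumerate(a):
        out[i * p] = c
    return out


def congruent_mod_p(a, b, p):
    """True when corresponding coefficients of a and b agree mod p."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return all((x - y) % p == 0 for x, y in zip(a, b))


f = [4, -3, 0, 7, 2]          # an arbitrary sample: 2x^4 + 7x^3 - 3x + 4
for p in (2, 3, 5, 7):
    print(p, congruent_mod_p(substitute_power(f, p), poly_pow(f, p), p))
```

Primality matters here: with the composite modulus 4 and f(x) = 1 + x, the middle coefficient 6 of $(1 + x)^4$ is not divisible by 4, so the analogous congruence fails.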
We return to our $\psi^{(m)}(x)$. Since $\psi^{(m)}(x) = \psi^{(1)}(x^{p^{m-1}})$ we have, from the
discussion above, that $\psi^{(m)}(x) \equiv \psi^{(1)}(x)^{p^{m-1}} \bmod p$. Therefore,


$\psi^{(m)}(x + 1) \equiv \psi^{(1)}(x + 1)^{p^{m-1}} = \left( \frac{(x + 1)^p - 1}{x} \right)^{p^{m-1}}$
$= \left( x^{p-1} + px^{p-2} + \frac{p(p-1)}{2} x^{p-3} + \cdots + \frac{p(p-1)}{2} x + p \right)^{p^{m-1}}$


$\equiv x^{(p-1)p^{m-1}} \bmod p$.

This tells us that

$\psi^{(m)}(x + 1) = x^{(p-1)p^{m-1}} + pr(x)$,


where r(x) is a polynomial with integer coefficients. So all the coefficients of
$\psi^{(m)}(x + 1)$, with the exception of its leading coefficient 1, are divisible by p.
If we knew for some reason that the constant term of $h(x) = \psi^{(m)}(x + 1)$ was
not divisible by $p^2$, we could apply the Eisenstein Criterion to show that h(x)
is irreducible. But what is the constant term of $h(x) = \psi^{(m)}(x + 1)$? It is
merely $h(0) = \psi^{(m)}(1)$, which, from the explicit form of $\psi^{(m)}(x)$ that we
found four paragraphs earlier, is exactly p. Thus h(x) is irreducible in Q[x],
that is, $\psi^{(m)}(x + 1)$ is irreducible in Q[x]. But this immediately implies that
$\psi^{(m)}(x)$ is irreducible in Q[x].
Summarizing, we have proved 


Theorem 6.5.2. For $n = p^m$, where p is any prime and m any nonnegative
integer, the polynomial $\phi_n(x)$ is irreducible in Q[x].
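
The Eisenstein conditions used in the argument for Theorem 6.5.2 can be inspected directly for small p and m. The following sketch is illustrative only: it builds the explicit polynomial from Lemma 6.5.1, substitutes x + 1 by the binomial theorem, and tests the three conditions. The names `psi`, `shift_by_one`, and `eisenstein_holds` are hypothetical, and m is assumed to be at least 1.

```python
from math import comb


def psi(p, m):
    """Coefficients (lowest degree first) of 1 + x^q + ... + x^((p-1)q), q = p^(m-1)."""
    q = p ** (m - 1)
    coeffs = [0] * ((p - 1) * q + 1)
    for k in range(p):
        coeffs[k * q] = 1
    return coeffs


def shift_by_one(coeffs):
    """Coefficients of f(x + 1) from those of f(x), via the binomial theorem."""
    out = [0] * len(coeffs)
    for k, c in enumerate(coeffs):
        for j in range(k + 1):
            out[j] += c * comb(k, j)
    return out


def eisenstein_holds(coeffs, p):
    """Eisenstein conditions at p for a monic integer polynomial of degree >= 1."""
    *lower, lead = coeffs
    return (lead == 1
            and all(c % p == 0 for c in lower)
            and lower[0] % (p * p) != 0)


for p, m in [(2, 3), (3, 2), (5, 2)]:
    h = shift_by_one(psi(p, m))
    print(p, m, h[0], eisenstein_holds(h, p))
```

In each case the constant term of h(x) comes out equal to p, matching the evaluation $h(0) = \psi^{(m)}(1) = p$ in the text.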


As we pointed out earlier, this is a very special case of the theorem we
shall soon prove; namely, that $\phi_n(x)$ is irreducible for all positive integers n.
Moreover, the result and proof of Theorem 6.5.2 play no role in the proof of
the general proposition that $\phi_n(x)$ is irreducible in Q[x]. But because of the
result in Theorem 6.5.2 and the explicit form of $\phi_n(x)$ when $n = p^m$, we do
get a pretty good idea of what should be true in general. We now proceed to
the discussion of the irreducibility of $\phi_n(x)$ for general n.


Theorem 6.5.3. For every integer n ≥ 1,

$\phi_n(x) = (x - \theta^{(1)}) \cdots (x - \theta^{(\varphi(n))})$,

where $\theta^{(1)}, \theta^{(2)}, \ldots, \theta^{(\varphi(n))}$ are the $\varphi(n)$ distinct primitive nth roots of unity.

Proof. We proceed by induction on n.

If n = 1, then $\phi_1(x) = x - 1$, and since 1 is the only first root of unity,
the result is certainly correct in this case.

Suppose that the result is true for all m < n. Thus, if d | n and d ≠ n, then,
by the induction, $\phi_d(x) = (x - \theta_d^{(1)}) \cdots (x - \theta_d^{(\varphi(d))})$, where the $\theta_d^{(i)}$ are the
primitive dth roots of unity. Now

$x^n - 1 = (x - \zeta_1)(x - \zeta_2) \cdots (x - \zeta_n)$,

where the $\zeta_i$ run over all nth roots of unity. Separating out the primitive nth
roots of unity in this product, we obtain


$x^n - 1 = (x - \theta^{(1)}) \cdots (x - \theta^{(\varphi(n))}) v(x)$,

where v(x) is the product of all the other $x - \zeta_i$. Each nth root of unity that is
not primitive is a primitive dth root of unity for exactly one divisor d of n with
d ≠ n; thus by our induction hypothesis v(x) is the product of the $\phi_d(x)$ over all
divisors d of n with the exception of d = n. Thus, since


$\phi_n(x) = \frac{x^n - 1}{v(x)} = \frac{(x - \theta^{(1)}) \cdots (x - \theta^{(\varphi(n))}) v(x)}{v(x)} = (x - \theta^{(1)}) \cdots (x - \theta^{(\varphi(n))})$,

we have proved the result claimed in the theorem. □
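
Theorem 6.5.3 can be illustrated numerically by multiplying out the linear factors over the primitive nth roots of unity and observing that the coefficients land on integers. This is a floating-point sketch under stated assumptions, suitable only for small n; the function name `cyclotomic_from_roots` is hypothetical and the final rounding step presumes the rounding error is far below 1/2.

```python
import cmath
from math import gcd


def cyclotomic_from_roots(n):
    """Multiply out (x - theta**k) over 1 <= k <= n with gcd(k, n) = 1.

    Returns the coefficients, lowest degree first, rounded to integers.
    """
    theta = cmath.exp(2j * cmath.pi / n)
    coeffs = [1 + 0j]                                   # the constant polynomial 1
    for k in range(1, n + 1):
        if gcd(k, n) == 1:
            root = theta ** k
            shifted = [0j] + coeffs                     # x * (running product)
            scaled = [root * c for c in coeffs] + [0j]  # root * (running product)
            coeffs = [s - t for s, t in zip(shifted, scaled)]
    return [round(c.real) for c in coeffs]


print(cyclotomic_from_roots(6))
print(cyclotomic_from_roots(12))
```

For n = 6 this gives the coefficients of $x^2 - x + 1$, in agreement with the example computed from the inductive definition.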


From the form of $\phi_n(x)$ in Theorem 6.5.3 we immediately see that
$\phi_n(x)$ is a monic polynomial in C[x] of degree $\varphi(n)$. Knowing this, we prove
that, in fact, the coefficients of $\phi_n(x)$ are integers. Why is this true?
Proceeding by induction on n, we may assume this to be the case if d | n and d ≠ n.
Therefore, if v(x) denotes the polynomial used in the proof of Theorem
6.5.3, then $(x^n - 1)/v(x) = \phi_n(x) \in C[x]$, hence $v(x) \mid (x^n - 1)$ in C[x]. But, by
the long-division process, dividing the monic polynomial v(x) with integer
coefficients into $x^n - 1$ leads us to a monic polynomial with integer
coefficients (and no remainder, since $v(x) \mid (x^n - 1)$ in C[x]). Thus $(x^n - 1)/v(x) =
\phi_n(x)$ is a monic polynomial with integer coefficients. As we saw, its degree is
$\varphi(n)$. Thus


Theorem 6.5.4. For every positive integer n the polynomial $\phi_n(x)$ is a
monic polynomial with integer coefficients of degree $\varphi(n)$, where $\varphi$ is the
Euler $\varphi$-function.


Knowing that $\phi_n(x)$ is a polynomial, we can see that its degree is $\varphi(n)$
in yet another way. From $\phi_n(x) = (x^n - 1)/v(x)$, using induction on n,
$\deg(\phi_n(x)) = n - \deg(v(x)) = n - \sum \varphi(d)$, the sum over all divisors d of n
other than d = n, from the form of v(x). Invoking the result of Theorem
6.2.1, $n - \sum \varphi(d) = \varphi(n)$, where again this sum is over all d | n, d ≠ n. We
thus obtain that $\deg(\phi_n(x)) = \varphi(n)$.
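
The degree count rests on the identity that the values of the Euler function over all divisors of n add up to n. A quick numerical check of the rearranged form used above might look as follows; the function names `euler_phi` and `degree_from_recursion` are hypothetical and the range tested is arbitrary.

```python
from math import gcd


def euler_phi(n):
    """Number of integers k with 1 <= k <= n and gcd(k, n) = 1."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def degree_from_recursion(n):
    """n minus the sum of phi(d) over the divisors d of n other than n."""
    return n - sum(euler_phi(d) for d in range(1, n) if n % d == 0)


print(all(degree_from_recursion(n) == euler_phi(n) for n in range(1, 200)))
```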

The result we are about to prove is without question one of the most 
basic ones about cyclotomic polynomials. 


Theorem 6.5.5. For every positive integer n the polynomial $\phi_n(x)$ is
irreducible in Q[x].


Proof. Let f(x) in Q[x] be an irreducible polynomial such that
$f(x) \mid \phi_n(x)$. Thus $\phi_n(x) = f(x)g(x)$ for some g(x) in Q[x]. By Gauss’ Lemma
we may assume that both f(x) and g(x) are monic polynomials with integer
coefficients, thus are in Z[x]. Our objective is to show that $\phi_n(x) = f(x)$; if
this were the case, then, since f(x) is irreducible in Q[x], we would have that
$\phi_n(x)$ is irreducible in Q[x].

Since $\phi_n(x)$ has no multiple roots, f(x) and g(x) must be relatively
prime. Let p be a prime number such that p does not divide n. If $\theta$ is a root
of f(x), it is then a root of $\phi_n(x)$, hence by Theorem 6.5.3 $\theta$ is a primitive nth
root of unity. Because p is relatively prime to n, $\theta^p$ is also a primitive nth
root of unity, thus, by Theorem 6.5.3, $\theta^p$ is a root of $\phi_n(x)$. We therefore
have that $0 = \phi_n(\theta^p) = f(\theta^p)g(\theta^p)$, from which we deduce that either $f(\theta^p) = 0$
or $g(\theta^p) = 0$.

Our aim is to show that $f(\theta^p) = 0$. Suppose not; then $g(\theta^p) = 0$, hence
$\theta$ is a root of $g(x^p)$. Because $\theta$ is also a root of the irreducible polynomial
f(x), by Lemma 6.3.2 we obtain that $f(x) \mid g(x^p)$. As we saw in the course of
the proof of Theorem 6.5.2, $g(x^p) \equiv g(x)^p \bmod p$.

Let J be the ideal in Z generated by p; by the Corollary to Theorem
4.6.2, $Z[x]/J[x] \simeq Z_p[x]$, which means that reducing the coefficients of any
polynomial mod p is a homomorphism of Z[x] onto $Z_p[x]$.

Since all the polynomials $\phi_n(x)$, v(x), f(x), and g(x) are in Z[x], if
$\bar{\phi}_n(x)$, $\bar{v}(x)$, $\bar{f}(x)$, and $\bar{g}(x)$ are their images in $Z_p[x]$, all the relations among
them are preserved going mod p. Thus we have the relations $x^n - 1 =
\bar{\phi}_n(x)\bar{v}(x)$, $\bar{\phi}_n(x) = \bar{f}(x)\bar{g}(x)$ and $\bar{f}(x) \mid \bar{g}(x^p) = \bar{g}(x)^p$.

Therefore, $\bar{f}(x)$ and $\bar{g}(x)$ have a common root, a, in some extension K
of $Z_p$. Now $x^n - 1 = \bar{\phi}_n(x)\bar{v}(x) = \bar{f}(x)\bar{g}(x)\bar{v}(x)$, hence a, as a root of both $\bar{f}(x)$
and $\bar{g}(x)$, is a multiple root of $x^n - 1$. But the formal derivative $(x^n - 1)'$ of
$x^n - 1$ is $nx^{n-1} \neq 0$, since p does not divide n; therefore, $(x^n - 1)'$ is
relatively prime to $x^n - 1$. By the result of Problem 3 of Section 3 the
polynomial $x^n - 1$ cannot have a multiple root. With this contradiction arrived at
from the assumption that $\theta^p$ was not a root of f(x), we conclude that
whenever $\theta$ is a root of f(x), then so must $\theta^p$ be one, for any prime p that does not
divide n.

Repeating this argument, we arrive at: $\theta^r$ is a root of f(x) for every
integer r that is relatively prime to n. But $\theta$, as a root of f(x), is a root of $\phi_n(x)$,
so is a primitive nth root of unity. Thus $\theta^r$ is also a primitive nth root of unity
for every r relatively prime to n. By running over all r that are relatively
prime to n, we pick up every primitive nth root of unity as some such $\theta^r$.
Thus all the primitive nth roots of unity are roots of f(x). By Theorem 6.5.3
we see that $\phi_n(x) = f(x)$, hence $\phi_n(x)$ is irreducible in Q[x]. □


It may strike the reader as artificial and unnatural to have resorted to
the passage mod p to carry out the proof of the irreducibility of a polynomial
with rational coefficients. In fact, it may very well be artificial and unnatural.
As far as we know, no proof of the irreducibility of $\phi_n(x)$ has ever been given
staying completely in Q[x] and not going mod p. It would be esthetically
satisfying to have such a proof. On the other hand, this is not the only instance
where a result is proved by passing to a related subsidiary system. Many
theorems in number theory (about the ordinary integers) have proofs that
exploit the integers mod p.

Because $\phi_n(x)$ is a monic polynomial with integer coefficients which is
irreducible in Q[x], and since $\theta_n$, the primitive nth root of unity, is a root of
$\phi_n(x)$, we have


Theorem 6.5.6. $\phi_n(x)$ is the minimal polynomial in Q[x] for the
primitive nth roots of unity.


PROBLEMS 


1. Verify that the first six cyclotomic polynomials are irreducible in Q[x] by a 
direct frontal attack. 


2. Write down the explicit forms of: 
(a) $\phi_{10}(x)$.
(b) $\phi_{15}(x)$.
(c) $\phi_{20}(x)$.

3. If $(x^m - 1) \mid (x^n - 1)$, prove that m | n.

4. If a > 1 is an integer and $(a^m - 1) \mid (a^n - 1)$, prove that m | n.

5. If K is a finite extension of Q, the field of rational numbers, prove that 
there is only a finite number of roots of unity in K. (Hint: Use the result of 
Problem 10 of Section 2, together with Theorem 6.5.6.) 


6. LIOUVILLE’S CRITERION 


Recall that a complex number is said to be algebraic of degree n if it is the 
root of a polynomial of degree n over Q, the field of rational numbers, and is 
not the root of any such polynomial of degree lower than n. In the terms used 
in Chapter 5, an algebraic number is a complex number algebraic over Q. 

A complex number that is not algebraic is called transcendental. Some
familiar numbers, such as e, π, $e^\pi$, and many others, are known to be
transcendental. Others, equally familiar, such as e + π, eπ, and $\pi^e$, are suspected
of being transcendental but, to date, this aspect of their nature is still open.

The French mathematician Joseph Liouville (1809-1882) gave a
criterion that any algebraic number of degree n must satisfy. This criterion gives
us a condition that limits the extent to which a real algebraic number of
degree n can be approximated by rational numbers. This criterion is of such a
nature that we can easily construct real numbers that violate it for every
n > 1. Any such number will then have to be transcendental. In this way we
shall be able to produce transcendental numbers at will. However, none of
the familiar numbers is such that its transcendence can be proved using
Liouville’s Criterion.

In this section of the book we present this result of Liouville. It is a sur- 
prisingly simple and elementary result to prove. This takes nothing away 
from the result; in our opinion it greatly enhances it. 


Theorem 6.6.1 (Liouville). Let a be an algebraic number of degree
n ≥ 2 (i.e., a is algebraic but not rational). Then there exists a positive
constant c (which depends only on a) such that for all integers u, v with
v > 0, $|a - u/v| > c/v^n$.


Proof. Let a be a root of the polynomial f(x) of degree n in Q[x],
where Q is the field of rational numbers. By clearing of denominators in the
coefficients of f(x), we may assume that $f(x) = r_0x^n + r_1x^{n-1} + \cdots + r_n$,
where all the $r_i$ are integers and where $r_0 > 0$.

Since the polynomial f(x) is irreducible of degree n it has n distinct
roots $a = a_1, a_2, \ldots, a_n$ in C, the field of complex numbers. Therefore, f(x)
factors over C as $f(x) = r_0(x - a_1)(x - a_2) \cdots (x - a_n)$. Let u, v be integers
with v > 0; then


$f\left(\frac{u}{v}\right) = \frac{r_0u^n}{v^n} + \frac{r_1u^{n-1}}{v^{n-1}} + \cdots + \frac{r_{n-1}u}{v} + r_n$,

hence

$v^n f\left(\frac{u}{v}\right) = r_0u^n + r_1u^{n-1}v + \cdots + r_{n-1}uv^{n-1} + r_nv^n$


is an integer. Moreover, since f(x) is irreducible in Q[x] of degree n ≥ 2, f(x)
has no rational roots, so $v^n f(u/v)$ is a nonzero integer, whence $|v^n f(u/v)| \ge 1$.
Using the factored form of f(x), we have that


$f\left(\frac{u}{v}\right) = r_0\left(\frac{u}{v} - a_1\right)\left(\frac{u}{v} - a_2\right) \cdots \left(\frac{u}{v} - a_n\right)$,


hence

$\left|\frac{u}{v} - a\right| = \frac{|f(u/v)|}{r_0\,|(u/v) - a_2| \cdots |(u/v) - a_n|}$
$= \frac{v^n |f(u/v)|}{r_0 v^n\,|(u/v) - a_2| \cdots |(u/v) - a_n|}$


$\ge \frac{1}{r_0 v^n\,|(u/v) - a_2| \cdots |(u/v) - a_n|}$.

Let s be the largest of $|a|, |a_2|, \ldots, |a_n|$. We divide the argument
according as |u/v| > 2s or |u/v| ≤ 2s. If |u/v| > 2s, then, by the triangle inequality,
$|a - (u/v)| \ge |u/v| - |a| > 2s - s = s$, and, since v ≥ 1, $|a - (u/v)| > s/v^n$.

On the other hand, if |u/v| ≤ 2s, then, again by the triangle inequality,
$|a_i - (u/v)| \le |a_i| + |u/v| \le s + 2s = 3s$. Therefore,


$t = |a_2 - (u/v)| \cdots |a_n - (u/v)| \le (3s)^{n-1}$,


so that $1/t \ge 1/(3s)^{n-1} = 1/(3^{n-1}s^{n-1})$. Going back to the inequality above
that $|a - (u/v)| \ge 1/[r_0v^n|a_2 - (u/v)| \cdots |a_n - (u/v)|]$, we have that
$|a - (u/v)| \ge 1/(r_0 3^{n-1}s^{n-1}v^n)$. These numbers $r_0$, $3^{n-1}$, $s^{n-1}$ are determined
once and for all by a and its minimal polynomial f(x) and do not depend on u
or v. If we let $b = 1/(r_0 3^{n-1}s^{n-1})$, then b > 0 and $|a - (u/v)| \ge b/v^n$. This
covers the second case, where |u/v| ≤ 2s.

If c is a positive number smaller than both b and s, we have from the
discussion that $|a - u/v| > c/v^n$ for all integers u, v, where v > 0, thereby
proving the theorem. □


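Let’s see the particulars of the proof for the particular case $a = \sqrt{2}$.
The minimal polynomial for a in Q[x] is f(x) = (x - a)(x + a), so $a = a_1$ and
$-a = a_2$. So if u and v are integers, and v > 0, then


$v^2 f\left(\frac{u}{v}\right) = v^2\left(\frac{u}{v} - a\right)\left(\frac{u}{v} + a\right) = v^2\left(\frac{u^2}{v^2} - 2\right) = u^2 - 2v^2$,


a nonzero integer (nonzero because $\sqrt{2}$ is irrational). So $|v^2 f(u/v)| \ge 1 \ge 1/v^2$.
The s above is the larger of $|\sqrt{2}|$ and $|-\sqrt{2}|$; that is, $s = \sqrt{2}$. Also, since
$r_0 = 1$, the b above is $1/(3^{2-1}(\sqrt{2})^{2-1}) = 1/(3\sqrt{2})$,
so if c is any positive number less than $1/(3\sqrt{2})$, then $|\sqrt{2} - u/v| > c/v^2$.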
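
The bound for $\sqrt{2}$ can be compared with what actually happens for small denominators. The sketch below is a floating-point illustration only: for each v up to a chosen limit it looks at the two integers u nearest to $v\sqrt{2}$ and records the smallest value of $v^2|\sqrt{2} - u/v|$. The function name `worst_ratio` and the limit 1000 are assumptions made for the example.

```python
from math import sqrt


def worst_ratio(max_v):
    """Smallest value of v**2 * |sqrt(2) - u/v| over 1 <= v <= max_v.

    For each v only the two integers u nearest to v*sqrt(2) can give
    the minimum, so only those are examined.
    """
    alpha = sqrt(2)
    best = float("inf")
    for v in range(1, max_v + 1):
        u0 = int(v * alpha)
        for u in (u0, u0 + 1):
            best = min(best, v * v * abs(alpha - u / v))
    return best


bound = 1 / (3 * sqrt(2))
print(worst_ratio(1000) > bound)
```

The observed minimum stays above $1/(3\sqrt{2})$, as the theorem guarantees; the constant from the proof is not claimed to be the best possible.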

What the theorem says is the following: Any algebraic real number has
rational numbers as close as we like to it (this is true for all numbers), but if
this algebraic real number a is of degree n ≥ 2, there are restrictions on how
finely we can approximate a by rational numbers. These restrictions are the
ones imposed by Liouville’s Theorem.

How do we use this result to produce transcendental numbers? All we
need do is to produce a real number $\tau$, say, such that whatever positive
integer n may be, and whatever positive c we choose, we can find a pair of
integers u, v, with v > 0 such that $|\tau - u/v| < c/v^n$. We can find such a $\tau$ easily by
writing down an infinite decimal involving 0’s and 1’s, where we make the 0’s
spread out between the 1’s very rapidly. For instance, if $\tau =
0.10100100000010 \ldots 010 \ldots$, where the 0’s between successive 1’s go like
m!, then $\tau$ is a number that violates Liouville’s Criterion for every n > 0.
Thus this number $\tau$ is transcendental.
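
To make the violation concrete, one can truncate the decimal after its kth digit 1 and compare the error with $c/v^n$. The sketch below works with exact rational arithmetic and is an illustration under stated assumptions: it reads the pattern as k! zeros between the kth and (k+1)th 1, it uses the crude tail estimate $|\tau - u/v| < 2 \cdot 10^{-p_{k+1}}$ (where $p_k$ is the position of the kth 1 and $v = 10^{p_k}$), and the names `one_positions` and `violating_index` are hypothetical.

```python
from fractions import Fraction
from math import factorial


def one_positions(count):
    """Decimal positions of the first `count` 1's in 0.1010010000001...

    Between the k-th and (k+1)-th 1 there are k! zeros.
    """
    positions = [1]
    for k in range(1, count):
        positions.append(positions[-1] + factorial(k) + 1)
    return positions


def violating_index(n, c, max_terms=8):
    """First k for which the tail bound certifies |tau - u/v| < c / v**n.

    Here u/v is the truncation after the k-th 1, so v = 10**p_k, and the
    error is bounded by 2 * 10**(-p_{k+1}). Returns None if no k up to
    max_terms is certified.
    """
    pos = one_positions(max_terms + 1)
    for k in range(1, max_terms + 1):
        p_k, p_next = pos[k - 1], pos[k]
        tail_bound = Fraction(2, 10 ** p_next)
        if tail_bound < Fraction(c) / 10 ** (n * p_k):
            return k
    return None


print(one_positions(4))
print(violating_index(2, 1), violating_index(3, 1))
```

Because the test uses an upper bound for the error rather than the error itself, the index returned is sufficient but need not be the earliest truncation that violates the inequality.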
We could, of course, use other wide spreads of 0’s between the 1’s,
such as $m^m$, $(m!)^2$, and so on, to produce hordes of transcendental numbers. Also,
instead of using just 1’s, we could use any of the nine nonzero digits to obtain
more transcendental numbers. We leave to the reader the verification that
the numbers of the sort we described do not satisfy Liouville’s Criterion for
any positive integer n and any positive c.

We can use the transcendental number $\tau$ and the variants of it we
described to prove a famous result due to Cantor. This result says that there is
a 1-1 correspondence between all the real numbers and their subset of real
transcendental numbers. In other words, in some sense, there are as many
transcendental reals as there are reals. We give a brief sketch of how we
carry it out, leaving the details to the reader.

First, it is easy to construct a 1-1 mapping of the reals onto those reals
strictly between 0 and 1 (try to find such a mapping). This is also true for the
real transcendental numbers and those of them strictly between 0 and 1. Let
the first set be A and the second one B, so $B \subset A$. Then, by a theorem in set
theory, it suffices to construct a 1-1 mapping of A into B.

Given any number in A, we can represent it as an infinite decimal
$0.a_1a_2 \ldots a_n \ldots$, where the $a_i$ fall between 0 and 9. (We now wave our hands
a little, being a little bit inaccurate. The reader should try to tighten up
the argument.) Define the mapping f from A to B by $f(0.a_1a_2 \ldots a_n \ldots) =
0.a_1 0 a_2 00 a_3 000000 a_4 \ldots$; by the Liouville Criterion, except for a small set of
$a_1, a_2, \ldots, a_n, \ldots$, the numbers $0.a_1 0 a_2 00 a_3 000000 a_4 \ldots$ are transcendental.
The f we wrote down then provides us with the required mapping.

One final word about the kind of approximation of algebraic numbers
by rationals expressed in Theorem 6.6.1. There we have that if a is real
algebraic of degree n ≥ 2, then $|a - u/v| > c/v^n$ for some appropriate positive c.
If we could decrease the n to $|a - u/v| > c/v^m$ for m < n and some suitable c
(depending on a and m), we would get an even sharper result. In 1955 the
(then) young English mathematician K. F. Roth proved the powerful result
that effectively we could cut the n down to 2. His exact result is: If a is
algebraic of degree n ≥ 2, then for every real number r > 2 there exists a positive
constant c, depending on a and r, such that $|a - u/v| > c/v^r$ for all but a finite
number of fractions u/v.


7. THE IRRATIONALITY OF π


As we indicated earlier, Lindemann in 1882 proved that π is a transcendental
number. In particular, from this result of Lindemann it follows that π is
irrational. We shall not prove the transcendence of π here (it would require a
rather long detour) but we will, at least, prove that π is irrational. The very
nice proof that we give of this fact is due to I. Niven; it appeared in his paper
“A Simple Proof That π Is Irrational,” which was published in the Bulletin
of the American Mathematical Society, vol. 53 (1947), p. 509. To follow
Niven’s proof only requires some material from a standard first-year calculus
course.

We begin with 


Lemma 6.7.1. If u is a real number, then $\lim_{n \to \infty} u^n/n! = 0$.


Proof. If u is any real number, then $e^u$ is a well-defined real number
and $e^u = 1 + u + u^2/2! + u^3/3! + \cdots + u^n/n! + \cdots$. The series $1 + u + u^2/2! +
\cdots + u^n/n! + \cdots$ converges to $e^u$; since this series converges, its nth term
must go to 0. Thus $\lim_{n \to \infty} u^n/n! = 0$. □


We now present Niven’s proof of the irrationality of π.


Theorem 6.7.2. π is an irrational number.


Proof. Suppose that π is rational; then π = a/b, where a and b are
positive integers.

For every integer n > 0, we introduce a polynomial, whose properties 
will lead us to the desired conclusion. The basic properties of this polynomial 
will hold for all positive n. The strategy of the proof is to make a judicious 
choice of n at the appropriate stage of the proof. 

Let $f(x) = x^n(a - bx)^n/n!$, where π = a/b. This is a polynomial of
degree 2n with rational coefficients. Expanding it out, we obtain


$f(x) = \frac{a_0x^n + a_1x^{n+1} + \cdots + a_nx^{2n}}{n!}$,


where


$a_0 = a^n$, $a_1 = -na^{n-1}b$, ..., $a_i = \frac{(-1)^i n!}{i!(n-i)!} a^{n-i}b^i$, ..., $a_n = (-1)^nb^n$

are integers.

We denote the ith derivative of f(x) with respect to x by the usual
notation $f^{(i)}(x)$, understanding $f^{(0)}(x)$ to mean f(x) itself.

We first note a symmetry property of f(x), namely, that f(x) =
f(π - x). To see this, note that $f(x) = (b^n/n!)x^n(\pi - x)^n$, from whose form it
is clear that f(x) = f(π - x). Since this holds for f(x), it is easy to see, from
the chain rule for differentiation, that $f^{(i)}(x) = (-1)^i f^{(i)}(\pi - x)$.

This statement about f(x) and all its derivatives allows us to conclude
that for the statements that we make about the nature of all the $f^{(i)}(0)$, there are
appropriate statements about all the $f^{(i)}(\pi)$.


We shall be interested in the value of $f^{(i)}(0)$, and $f^{(i)}(\pi)$, for all
nonnegative i. Note that from the expanded form of f(x) given above we easily
obtain that $f^{(i)}(0)$ is merely i! times the coefficient of $x^i$ of the polynomial f(x).
This immediately implies, since the lowest power of x appearing in f(x) is
the nth, that $f^{(i)}(0) = 0$ if i < n. For i ≥ n we obtain that $f^{(i)}(0) = i!\,a_{i-n}/n!$
(with $a_{i-n}$ read as 0 when i > 2n); since i ≥ n, i!/n! is an integer, and as we
pointed out above, $a_{i-n}$ is also an integer; therefore $f^{(i)}(0)$ is an integer for
all nonnegative integers i. Since $f^{(i)}(\pi) = (-1)^i f^{(i)}(0)$, we have that $f^{(i)}(\pi)$
is an integer for all nonnegative integers i.
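
The integrality of the derivative values depends only on the algebraic shape of f(x), so it can be checked with exact rational arithmetic by letting a/b be an actual rational number in place of the hypothetical value of π. In the sketch below the choice a = 22, b = 7, n = 3 is purely an illustrative assumption, and the names `niven_coefficients`, `derivative`, `evaluate`, and `derivative_values` are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def niven_coefficients(a, b, n):
    """Coefficients (lowest degree first) of f(x) = x^n (a - b x)^n / n!."""
    coeffs = [Fraction(0)] * (2 * n + 1)
    for i in range(n + 1):
        coeffs[n + i] = Fraction((-1) ** i * comb(n, i) * a ** (n - i) * b ** i,
                                 factorial(n))
    return coeffs


def derivative(coeffs):
    """Coefficients of the derivative of a polynomial."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Value of the polynomial at x (Horner's rule, exact arithmetic)."""
    acc = Fraction(0)
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc


def derivative_values(a, b, n):
    """Pairs (f^(i)(0), f^(i)(a/b)) for i = 0, ..., 2n."""
    coeffs, point, values = niven_coefficients(a, b, n), Fraction(a, b), []
    for _ in range(2 * n + 1):
        values.append((evaluate(coeffs, 0), evaluate(coeffs, point)))
        coeffs = derivative(coeffs)
    return values


for i, (at_zero, at_point) in enumerate(derivative_values(22, 7, 3)):
    print(i, at_zero, at_point)
```

Every value printed is an integer, and the value at a/b equals $(-1)^i$ times the value at 0, which is the symmetry $f^{(i)}(x) = (-1)^i f^{(i)}(\pi - x)$ with a/b standing in for π.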

We introduce an auxiliary function


$F(x) = f(x) - f^{(2)}(x) + f^{(4)}(x) - \cdots + (-1)^n f^{(2n)}(x)$.

Since $f^{(m)}(x) = 0$ if m > 2n, we see that


$\frac{d^2F}{dx^2} = F''(x) = f^{(2)}(x) - f^{(4)}(x) + \cdots + (-1)^{n-1} f^{(2n)}(x)$


$= -F(x) + f(x)$.


Therefore,


$\frac{d}{dx}\left(F'(x) \sin x - F(x) \cos x\right) = F''(x) \sin x + F'(x) \cos x$
$- F'(x) \cos x + F(x) \sin x$
$= (F''(x) + F(x)) \sin x = f(x) \sin x$.


From this we conclude that


$\int_0^\pi f(x) \sin x \, dx = \left[F'(x) \sin x - F(x) \cos x\right]_0^\pi$

$= (F'(\pi) \sin \pi - F(\pi) \cos \pi) - (F'(0) \sin 0 - F(0) \cos 0)$
$= F(\pi) + F(0)$.

But from the form of F(x) above and the fact that all $f^{(i)}(0)$ and $f^{(i)}(\pi)$ are
integers, we conclude that F(π) + F(0) is an integer. Thus $\int_0^\pi f(x) \sin x \, dx$ is
an integer. This statement about $\int_0^\pi f(x) \sin x \, dx$ is true for any integer n > 0
whatsoever. We now want to choose n cleverly enough to make sure that the
statement “$\int_0^\pi f(x) \sin x \, dx$ is an integer” cannot possibly be true.
We carry out an estimate on $\int_0^\pi f(x) \sin x \, dx$. For 0 < x < π the
function $f(x) = x^n(a - bx)^n/n! \le \pi^na^n/n!$ (since a > 0), and also 0 < sin x ≤ 1.
Thus $0 < \int_0^\pi f(x) \sin x \, dx < \int_0^\pi \pi^na^n/n! \, dx = \pi^{n+1}a^n/n!$.

Let u = πa; then, by Lemma 6.7.1, $\lim_{n \to \infty} u^n/n! = 0$, so if we pick n large
enough, we can make sure that $u^n/n! < 1/\pi$, hence $\pi^{n+1}a^n/n! = \pi u^n/n! < 1$.
But then $\int_0^\pi f(x) \sin x \, dx$ is trapped strictly between 0 and 1. But, by what we
have shown, $\int_0^\pi f(x) \sin x \, dx$ is an integer. Since there is no integer strictly
between 0 and 1, we have reached a contradiction. Thus the premise that π is
rational was false. Therefore, π is irrational. This completes the proof of
the theorem. □
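
How large n must be in the last step depends on the assumed numerator a. The sketch below finds the first n at which $\pi^{n+1}a^n/n!$ drops below 1, working with logarithms so that the very large intermediate values do not overflow. It is an illustration only: the function name `first_small_n` is hypothetical, and the sample values of a are arbitrary, since no actual a with π = a/b exists.

```python
from math import lgamma, log, pi


def first_small_n(a):
    """Smallest n >= 1 with pi**(n+1) * a**n / n! < 1, for a positive integer a.

    Compares log(pi) + n*log(pi*a) with log(n!) = lgamma(n + 1).
    """
    u = pi * a
    n = 1
    while log(pi) + n * log(u) - lgamma(n + 1) >= 0:
        n += 1
    return n


for a in (1, 3, 22):
    print(a, first_small_n(a))
```

The terms $\pi u^n/n!$ first grow (while n < u) and then shrink, so once the bound falls below 1 it stays there; that is why taking any larger n also works in the proof.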